- Path: bloom-beacon.mit.edu!hookup!europa.eng.gtefsd.com!howland.reston.ans.net!EU.net!julienas!chorus!chorus.fr
- From: jloup@chorus.fr (Jean-loup Gailly)
- Newsgroups: comp.compression,comp.compression.research,news.answers,comp.answers
- Subject: comp.compression Frequently Asked Questions (part 1/3)
- Summary: *** READ THIS BEFORE POSTING ***
- Keywords: data compression, FAQ
- Message-ID: <compr1_17apr94@chorus.fr>
- Date: 17 Apr 94 12:28:43 GMT
- Expires: 30 May 94 16:17:20 GMT
- Sender: news@chorus.chorus.fr
- Reply-To: jloup@chorus.fr
- Followup-To: comp.compression
- Lines: 2643
- Approved: news-answers-request@MIT.Edu
- Supersedes: <compr1_18mar94@chorus.fr>
- Xref: bloom-beacon.mit.edu comp.compression:7402 comp.compression.research:1152 news.answers:18171 comp.answers:4937
-
- Archive-name: compression-faq/part1
- Last-modified: April 17th, 1994
-
- "I've already explained this once, but repetition is
- the very soul of the net." (from alt.config)
-
- This file is part 1 of a set of Frequently Asked Questions (FAQ) for
- the groups comp.compression and comp.compression.research. If you
- can't find part 2 or 3, see item 53 below. A copy of this FAQ is available
- by ftp in rtfm.mit.edu:/pub/usenet/news.answers/compression-faq/part[1-3].
-
- Certain questions get asked time and again, and this is an attempt to
- reduce the bandwidth taken up by these posts and their associated
- replies. If you have a question, *please* check this file before you
- post. It may save a lot of people's time.
-
- If you have not already read the overall Usenet introductory material
- posted to "news.announce.newusers", please do. It is also available by
- ftp in garbo.uwasa.fi:/pc/doc-net/usenews.zip (see item 2 below about .zip).
-
- If you don't want to see this FAQ regularly, please add the subject
- line to your kill file. (If you don't know what a kill file is, get
- by ftp the file rtfm.mit.edu:/pub/usenet/news.answers/killfile-faq.)
- If you have corrections or suggestions for this FAQ, send them to
- Jean-loup Gailly <jloup@chorus.fr>. Thank you.
-
- Part 1 is oriented towards practical usage of compression programs.
- Part 2 is intended more for people who want to know how compression works.
- Part 3 is a long list of image compression hardware.
-
- Main changes relative to the previous version:
-
- - added .hap extension (item 2)
- - fixed several typos in file names (item 2)
- - new version of unp (item 2)
- - updated pointer to ivs (items 15 and 20)
- - new ftp sites for medical images (item 55)
-
- Contents
- ========
-
- General questions:
-
- [1] What are these newsgroups about?
- [2] What is this .xxx file type?
- Where can I find the corresponding compression program?
- [3] What is the latest pkzip version?
- [4] What is an archiver?
- [5] What is the best general purpose compression program?
- [7] Which books should I read?
- [8] What about patents on data compression algorithms?
- [9] The WEB 16:1 compressor.
- [11] What is the V.42bis standard?
- [12] I need source for the winners of the Dr Dobbs compression contest
- [13] I need source for arithmetic coding
-
- Image and audio compression:
-
- [15] Where can I get image compression programs?
- [16] What is the state of the art in lossless image compression?
- [17] What is the state of fractal compression?
- [18] I need specs and source for TIFF and CCITT group 4 Fax.
- [19] What is JPEG?
- [20] I am looking for source of an H.261 codec and MPEG
- [25] Fast DCT (Discrete Cosine Transform) algorithms
- [26] Are there algorithms and standards for audio compression?
-
- Common problems:
-
- [30] My archive is corrupted!
- [31] pkunzip reports a CRC error!
- [32] VMS zip is not compatible with pkzip!
- [33] I have a problem with Stacker or DoubleSpace!
-
- Questions which do not really belong to comp.compression:
-
- [50] What is this 'tar' compression program?
- [51] I need a CRC algorithm
- [52] What about those people who continue to ask frequently asked questions?
- [53] Where are FAQ lists archived?
- [54] I need specs for graphics formats
- [55] Where can I find Lenna and other images?
- [56] I am looking for a message digest algorithm
-
-
- Part 2: (Long) introductions to data compression techniques
-
- [70] Introduction to data compression (long)
- Huffman and Related Compression Techniques
- Arithmetic Coding
- Substitutional Compressors
- The LZ78 family of compressors
- The LZ77 family of compressors
-
- [71] Introduction to MPEG (long)
- What is MPEG?
- Does it have anything to do with JPEG?
- Then what's JBIG and MHEG?
- What has MPEG accomplished?
- So how does MPEG I work?
- What about the audio compression?
- So how much does it compress?
- What's phase II?
- When will all this be finished?
- How do I join MPEG?
- How do I get the documents, like the MPEG I draft?
-
- [72] What is wavelet theory?
- [73] What is the theoretical compression limit?
- [74] Introduction to JBIG
- [75] Introduction to JPEG
- [76] What is Vector Quantization?
- [77] Introduction to Fractal compression
-
-
- Part 3: (Long) list of image compression hardware
-
- [85] Image compression hardware
- [99] Acknowledgments
-
-
- Search for "Subject: [#]" to get to question number # quickly. Some news
- readers can also take advantage of the message digest format used here.
-
- If you know very little about data compression, read question 70 in
- part 2 first.
-
- ------------------------------------------------------------------------------
-
- Subject: [1] What are these newsgroups about?
-
-
- comp.compression is the place to discuss data compression, both
- lossless (for text or data) and lossy (for images, sound, etc.).
- comp.compression.research was created later to provide a forum for
- current research on data compression and data compression algorithms;
- this group is now moderated. If you are not experienced in data compression,
- please post in comp.compression only.
-
- There is no archive for comp.compression; the volume is too high.
- (If you know an ftp site archiving all of comp.compression, tell me
- at jloup@chorus.fr).
-
- If you only want to find a particular compression program for a
- particular operating system, please first read this FAQ and the
- article "How to find sources" which is regularly posted in
- news.answers.
-
- If you can't resist posting such a request, other groups are probably
- more appropriate (comp.binaries.ibm.pc.wanted, comp.os.msdos.apps,
- comp.sources.wanted, comp.sys.mac.wanted, comp.archives.msdos.d, comp.dsp,
- alt.graphics.pixutils). Please post your request in comp.compression
- only as a last resort.
-
- If your question is about graphics only (no compression), please
- post to comp.graphics, *after* reading the comp.graphics FAQ (see
- item 54 below). For some unknown reason, many questions about
- graphics are incorrectly posted to comp.compression.
- For questions related to audio compression, check also comp.dsp.
-
- Please do not post any program in binary form to comp.compression.
- Very short sources can be posted, but long sources should be posted
- to the specialized source groups, such as comp.sources.* or alt.sources.
- If the program is already available by ftp, just give the name of the
- ftp site and the full path name of the file.
-
- As with any newsgroup, do not post the same message separately to
- comp.compression and comp.compression.research.
-
- ------------------------------------------------------------------------------
-
- Subject: [2] What is this .xxx file type?
- Where can I find the corresponding compression program?
-
-
- All the programs mentioned in this section are lossless.
- For most programs, one US and one European ftp site are given.
- (oak.oakland.edu: 141.210.10.117, garbo.uwasa.fi: 128.214.87.1)
- Many other sites (in particular wuarchive.wustl.edu: 128.152.135.4)
- have the same programs.
-
- To keep this list to a reasonable size, many programs are not
- mentioned here. Additional information can be found in the file
- ftp.cso.uiuc.edu:/doc/pcnet/compression [128.174.5.61] maintained by
- David Lemson (lemson@uiuc.edu). When several programs can handle
- the same archive format, only one of them is given.
-
- Sources for additional lossless data compressors can be found in
- garbo.uwasa.fi:/pc/programming/lds_11.zip and
- oak.oakland.edu:/pub/msdos/archivers/lz-comp2.zip.
- Sources in Pascal are in garbo.uwasa.fi:/pc/turbopas/preskit2.zip.
-
- For Macintosh programs, look on sumex-aim.stanford.edu:/info-mac [36.44.0.6].
- For VM/CMS, look on vmd.cso.uiuc.edu:/public.477 [128.174.5.98].
- For Atari, look on atari.archive.umich.edu [141.211.165.41]
- For Amiga, look on ftp.cso.uiuc.edu:/pub/amiga [128.174.5.59]
-
- If you don't know how to use ftp or don't have ftp access, read the
- article "How to find sources" which is regularly posted in news.answers.
-
- If you can't find a program given below, it is likely that a newer
- version exists in the same directory. (Tell me <jloup@chorus.fr>)
-
- A very short description of the compression algorithm is given for
- most programs. For the meaning of LZ77, LZ78 and LZW, see question 70
- in part 2 of the FAQ. If you are looking for the file format of a
- specific compression program, get the sources of the decompressor.
-
- ext: produced by or read by
-
- .arc: arc, pkarc for MSDOS. (LZW algorithm)
- wuarchive.wustl.edu:/mirrors/msdos/starter/pk361.exe
- garbo.uwasa.fi:/pc/arcers/pk361.exe
-
- arc for Unix
- wuarchive.wustl.edu:/mirrors/misc/unix/arc521e.tar-z
- garbo.uwasa.fi:/unix/arcers/arc.tar.Z
- Contact: Howard Chu <hyc@umix.cc.umich.edu>
-
- arc for VMS
- wuarchive.wustl.edu:/packages/compression/vax-vms/arc.exe
-
- arcmac for Mac
- mac.archive.umich.edu:/mac/utilities/compressionapps/arcmac.hqx
-
- arc for Amiga
- ftp.funet.fi:pub/amiga/fish/001-100/ff070/arc.lha
-
- .arj: arj for MSDOS (LZ77 with hashing, plus secondary static Huffman
- encoding on a block basis)
- Contact: Robert K Jung <robjung@world.std.com>
- wuarchive.wustl.edu:/mirrors/msdos/archivers/arj241a.exe
- garbo.uwasa.fi:/pc/arcers/arj241a.exe
-
- unarj for Unix. Decompresses only. (There is no arj compressor for Unix.
- Don't post a request.)
- wuarchive.wustl.edu:/mirrors/misc/unix/unarj241.tar-z
- garbo.uwasa.fi:/unix/arcers/unarj241.tar.Z
-
- unarj for Mac
- mac.archive.umich.edu:/mac/util/compression/unarjmac.cpt.hqx
-
- unarj for Amiga
- ftp.funet.fi:pub/amiga/utilities/archivers/unarj-0.5.lha
-
- .bck: VMS BACKUP. BACKUP is *not* a compression program. Do "help backup".
-
- .cpt: Compact Pro for Mac
- sumex-aim.stanford.edu:/info-mac/util/compact-pro-133.hqx [36.44.0.6]
-
- For Unix:
- sumex-aim.stanford.edu:/info-mac/unix/macutil-20b1.shar
- ftp.cwi.nl:/pub/macutil2.0b3.shar.Z
-
- .exe: self-extracting MSDOS executable (creates files on disk when run)
- Run the file, or try unzip, lha or arj on it.
-
- .exe: compressed MSDOS executable (decompresses itself in memory then runs
- the decompressed code). To get the original uncompressed .exe:
- oak.oakland.edu:/pub/msdos/execomp/unp330.zip
- To create such files:
- oak.oakland.edu:/pub/msdos/execomp/lzexe91e.zip
- nic.funet.fi:/pub/msdos/windows/util/winlite1.zip (for Windows exe)
-
- .gif: gif files are images compressed with the LZW algorithm. See the
- comp.graphics FAQ list for programs manipulating .gif files. See
- suffix .Z below for source of LZW.
-
- .gz, .z: gzip (or pack, see .z below). gzip uses the same algorithm as
- zip 2.0 (see below); it can also extract packed and compressed files.
- For Unix, MSDOS, OS/2, VMS, Atari, Amiga, Primos:
- prep.ai.mit.edu:/pub/gnu/gzip-1.2.4.tar (or .shar or .tar.gz : source)
- prep.ai.mit.edu:/pub/gnu/gzip-msdos-1.2.4.exe (MSDOS, lha self-extract)
- garbo.uwasa.fi:/unix/arcers/gzip-1.2.4.tar.Z (source)
- garbo.uwasa.fi:/pc/arcers/gzip124.zip (MSDOS exe)
- ftp.uu.net:/pub/archiving/zip/VMS/gzip123x.exe (VMS exe)
- mac.archive.umich.edu:/mac/util/compression/macgzip0.2.cpt.hqx (Mac)
- mac.archive.umich.edu:/mac/development/source/macgzip0.2src.cpt.hqx
-
- .ha: ha 0.98 for MSDOS (improved PPMC - 4th order Markov modeling)
- garbo.uwasa.fi:/pc/arcers/ha098.zip
-
- .hap: Hamarsoft HAP 3.00 archiver. Contact: harald.feldmann@almac.co.uk
- garbo.uwasa.fi:/pc/arcers/hap300re.zip
-
- .hqx: Macintosh BinHex format. (BinHex is *not* a compression program,
- it is similar to uuencode but handles multiple forks.)
- for Mac:
- mac.archive.umich.edu:/mac/utilities/compressionapps/binhex4.0.bin
-
- for Unix:
- sumex-aim.stanford.edu:/info-mac/cmp/mcvert-212.shar [36.44.0.6]
-
- for MSDOS:
- wuarchive.wustl.edu:/mirrors/msdos/xbin23.zip
-
- .lha:
- .lzh: lha for MSDOS (LZ77 with a trie data structure, plus secondary static
- Huffman coding on a block basis)
- oak.oakland.edu:/pub/msdos/archiver/lha213.exe (exe)
- oak.oakland.edu:/pub/msdos/archiver/lha211sr.zip (sources)
- garbo.uwasa.fi:/pc/arcers/lha255b.exe
-
- lharc for Unix. (LZ77 with hash table and binary trees, plus secondary
- Huffman coding)
- Warning: lharc can extract .lzh files created by
- lharc 1.xx but not those created by lha. See lha for Unix below.
- wuarchive.wustl.edu:/mirrors/misc/unix/lharc102a.tar-z
- garbo.uwasa.fi:/unix/arcers/lha101u.tar.Z
-
- lharc for VMS. Same warning as for Unix lharc.
- wuarchive.wustl.edu:/packages/compression/vax-vms/lharc.exe
-
- lha for Unix. Warning: all doc is in Japanese.
- wuarchive.wustl.edu:/mirrors/misc/unix/lha101u.tar-z
- garbo.uwasa.fi:/unix/arcers/lha-1.00.tar.Z
- Contact: lha-admin@oki.co.jp or oki@wbg.telcom.oki.co.jp
-
- lha for Mac
- mac.archive.umich.edu:/mac/utilities/compressionapps/maclha2.0.cpt.hqx
-
- lha for Amiga
- ftp.funet.fi:pub/amiga/utilities/archivers/LhA_e138.run
-
-
- .pak: pak for MSDOS (LZW algorithm)
- wuarchive.wustl.edu:/mirrors/msdos/archivers/pak251.exe
- garbo.uwasa.fi:/pc/arcers/pak251.exe
-
- .pit: PackIt (Macintosh)
- for Mac:
- sumex-aim.stanford.edu:/info-mac/util/stuffit-151.hqx [36.44.0.6]
-
- for Unix:
- sumex-aim.stanford.edu:/info-mac/unix/mcvert-165.shar [36.44.0.6]
-
- .pp: PowerPacker (Amiga)
- ftp.funet.fi:pub/amiga/fish/501-600/ff561/PPLib.lha
-
- .sea: self-extracting archive (Macintosh)
- Run the file to extract it. The self-extraction code can be
- removed with:
- mac.archive.umich.edu:/mac/utilities/compressionapps/desea1.11.cpt.hqx
-
- .sdn: used by the Shareware Distribution Network.
- Try the decompressors for .pak or .arj (see above)
-
- .shar: Shell archive. This is not a compression program. Use "sh foo.shar"
- to extract.
-
- .sit: Stuffit for Macintosh
- for Mac:
- sumex-aim.stanford.edu:/info-mac/util/stuffit-lite-30.hqx [36.44.0.6]
-
- for Unix:
- sumex-aim.stanford.edu:/info-mac/cmp/unsit-15-unix.shar [36.44.0.6]
-
- for Amiga:
- ftp.funet.fi:pub/amiga/utilities/archivers/unsit-1.5c2.lha
-
- for MSDOS:
- garbo.uwasa.fi:/pc/arcers/unsit30.zip
-
- .?q?: Squeeze for MSDOS (do not confuse with other 'squeeze' below).
- Static Huffman coding.
- oak.oakland.edu:/pub/msdos/starter/sqpc12a.com (squeeze)
- oak.oakland.edu:/pub/msdos/starter/nusq110.com (unsqueeze)
-
- .sqz: Squeeze for MSDOS (do not confuse with other 'squeeze' above)
- LZ77 with hashing.
- wuarchive.wustl.edu:/mirrors/msdos/archivers/sqz1083e.exe
- garbo.uwasa.fi:/pc/arcers/sqz1083e.exe
-
- .tar: tar is *not* a compression program. However, to be kind to you:
- for MSDOS
- wuarchive.wustl.edu:/mirrors/msdos/starter/tarread.exe
- garbo.uwasa.fi:/pc/unix/tar4dos.zoo
-
- for Unix
- tar (you have it already. To extract: tar xvf file.tar)
-
- for VMS
- wuarchive.wustl.edu:/packages/compression/vax-vms/tar.exe
-
- for Macintosh
- sumex-aim.stanford.edu:/info-mac/util/tar-30.hqx
-
- for Amiga:
- ftp.funet.fi:pub/amiga/fish/401-500/ff445/Tar.lha
-
- .tar.Z, .tar-z, .taz: tar + compress
- For Unix: zcat file.tar.Z | tar xvf -
- with GNU tar: tar xvzf file.tar.Z
- Other OS: first uncompress (see .Z below) then untar (see .tar above)
-
- .tar.gz, tar.z, .tgz: tar + gzip
- For Unix: gzip -cd file.tar.gz | tar xvf -
- with GNU tar: tar xvzf file.tar.gz
- Other OS: first uncompress (see .gz above) then untar (see .tar above)
-
- .td0: (compressed MS-DOS floppy image produced by TeleDisk)
- oak.oakland.edu:/pub/msdos/diskutil/teled212.zip
-
- .uc2: UC2 for MSDOS and OS/2. (LZ77 with secondary static Huffman encoding on
- a block basis, and dynamic dictionaries shared among files.)
- Contact: desk@aip.nl
- garbo.uwasa.fi:/pc/arcers/uc2ins.exe
-
- .z: pack or gzip (see .gz above). pack uses static Huffman coding.
- To extract, see .gz above.
-
- .zip: pkzip 1.10 for MSDOS. (LZ77 with hashing, plus secondary static
- Shannon-Fano encoding on whole file)
- Contact: pkware.inc@mixcom.com
- wuarchive.wustl.edu:/mirrors/msdos/zip/pkz110eu.exe
- garbo.uwasa.fi:/pc/goldies/pkz110eu.exe
- Note: pkz110eu.exe is an 'export' version without encryption.
-
- zip 1.1 for Unix, MSDOS, VMS, OS/2, ... (compatible with pkzip 1.10.
- For corresponding unzip, see unzip 5.1 below).
- ftp.uu.net:/pub/archiving/zip/zip11.zip
-
- arcutil 2.0 for VM/CMS (unzip only, not yet compatible with pkzip 2.04)
- vmd.cso.uiuc.edu:/public.477/arcutil.* [128.174.5.98].
-
- pkzip 2.04g for MSDOS. (LZ77 with hashing, plus secondary static
- Huffman coding on a block basis)
- oak.oakland.edu:/pub/msdos/zip/pkz204g.exe
- garbo.uwasa.fi:/pc/arcers/pkz204g.exe
-
- zip 2.0.1 and unzip 5.1 for Unix, MSDOS, VMS, OS/2, Amiga, ...
- Compatible with pkzip 2.04g (LZ77 with hashing, plus secondary static
- Huffman coding on a block basis). Contact: zip-bugs@wkuvx1.wku.edu
- (On SGI, do not confuse with the editor also named 'zip'.)
-
- ftp.uu.net:/pub/archiving/zip/zip201.zip (source)
- ftp.uu.net:/pub/archiving/zip/unzip51.tar.Z (source)
- ftp.uu.net:/pub/archiving/zip/MSDOS/zip20x.zip (MSDOS exe)
- ftp.uu.net:/pub/archiving/zip/MSDOS/unzip51x.exe (MSDOS exe)
- ftp.uu.net:/pub/archiving/zip/VMS/unz50p1x.exe (Vax/VMS exe)
- ftp.uu.net:/pub/archiving/zip/VMS/zip20x-vms.zip (Vax/VMS exe)
- ftp.uu.net:/pub/archiving/zip/AMIGA/unzip51x.* (Amiga exe)
- ftp.uu.net:/pub/archiving/zip/AMIGA/zip201x.zip (Amiga exe)
- ftp.uu.net:/pub/archiving/zip/OS2*/* (OS/2 exe 16&32 bit)
- ftp.uu.net:/pub/archiving/zip/zcrypt21.zip (encryption source)
- (Non US residents must get the crypt versions from garbo, see below)
-
- garbo.uwasa.fi:/unix/arcers/zip201.zip (source)
- garbo.uwasa.fi:/unix/arcers/unzip51.tar.Z (source)
- garbo.uwasa.fi:/pc/arcers/zip20x.zip (MSDOS exe)
- garbo.uwasa.fi:/pc/arcers/unz51x3.exe (MSDOS exe)
- garbo.uwasa.fi:/unix/arcers/zcrypt21.zip (encryption source)
-
- for Macintosh:
- mac.archive.umich.edu:/mac/util/compression/unzip2.01.cpt.hqx
- mac.archive.umich.edu:/mac/util/compression/zipit1.2.cpt.hqx
- ftp.uu.net:/pub/archiving/zip/MAC/mac-unzip-51.hqx
-
- .zoo: zoo 2.10 for MSDOS (algorithm copied from that of lha, see lha above)
- Contact: Rahul Dhesi <dhesi@cirrus.com>
- wuarchive.wustl.edu:/mirrors/msdos/zoo/zoo210.exe
- garbo.uwasa.fi:/pc/arcers/zoo210.exe
-
- zoo 2.10 for Unix, VMS
- oak.oakland.edu:/pub/misc/unix/zoo210.tar.Z
- garbo.uwasa.fi:/unix/arcers/zoo210.tar.Z
-
- zoo for Mac
- mac.archive.umich.edu:/mac/utilities/compressionapps/maczoo.sit.hqx
-
- zoo for Amiga
- ftp.funet.fi:pub/amiga/utilities/archivers/Zoo-2.1.lha
-
- .F: freeze for Unix (LZ77 with hashing, plus secondary dynamic Huffman
- encoding)
- wuarchive.wustl.edu:/usenet/comp.sources.misc/volume35/freeze/part0[1-3].Z
- ftp.inria.fr:/system/arch-compr/freeze-2.5.tar.Z
- Contact: Leonid A. Broukhis <leo@s514.ipmce.su>
-
- .Y: yabba for Unix, VMS, ... (Y coding, a variant of LZ78)
- wuarchive.wustl.edu:/usenet/comp.sources.unix/volume24/yabbawhap/part0[1-4].Z
- ftp.inria.fr:/system/arch-compr/yabba.tar.Z
- Contact: Dan Bernstein <djb@silverton.berkeley.edu>
-
- .Z: compress for Unix ('the' LZW algorithm)
- It is likely that your Unix system has 'compress' already. Otherwise:
- wuarchive.wustl.edu:/packages/compression/compress-4.1.tar
- (not in .Z format to avoid chicken and egg problem)
-
- compress for MSDOS
- oak.oakland.edu:/pub/msdos/compress/comp430[ds].zip
- garbo.uwasa.fi:/pc/unix/comp430d.zip
- garbo.uwasa.fi:/pc/source/comp430s.zip
-
- compress for Macintosh
- sumex-aim.stanford.edu:/info-mac/util/maccompress-32.hqx
-
- compress for Amiga
- ftp.funet.fi:pub/amiga/utilities/archivers/compress-4.1.lha
-
- compress for Vax/VMS
- wuarchive.wustl.edu:/packages/compression/vax-vms/lzcomp.exe
- wuarchive.wustl.edu:/packages/compression/vax-vms/lzdcmp.exe
-
- ------------------------------------------------------------------------------
-
- Subject: [3] What is the latest PKZIP version?
-
- The latest official version is 2.04g. Release 2.04c had serious bugs,
- corrected in 2.04e and 2.04g.
-
- Be warned that there are countless bogus PKZIP 1.20, 2.0, 2.02,
- 3.05 and whatever scams floating around. They are usually hacks of
- the PKZIP 1.93A beta test version. Some of them are trojans and/or
- carry computer viruses.
-
- Note about pkzip 2.06 from a PKware employee:
-
- Version 2.06 was released as an INTERNAL use only IBM version.
- It is identical to 2.04G, but it has IBM names in the help
- screens and such. That release is meant for IBM only.
-
- ------------------------------------------------------------------------------
-
- Subject: [4] What is an archiver?
-
-
- There is a distinction between archivers and other compression
- programs:
-
- - an archiver takes several input files, compresses them and produces
- a single archive file. Examples are arc, arj, lha, zip, zoo.
-
- - other compression programs create one compressed file for each
- input file. Examples are freeze, yabba, compress. Such programs
- are often combined with tar to create compressed archives (see
- question 50: "What is this tar compression program?").
-
- ------------------------------------------------------------------------------
-
- Subject: [5] What is the best general purpose compression program?
-
-
- The answer is: it depends. (You did not expect a definitive answer,
- did you?)
-
- It depends on whether you favor speed, compression ratio, a standard and
- widely used archive format, the number of features, etc... Just as
- for text editors, personal taste plays an important role. compress has
- 4 options, arj 2.30 has about 130 options; different people like
- different programs. *Please* do not start or continue flame wars on
- such matters of taste.
-
- The only objective comparisons are speed and compression ratio. Here
- is a short table comparing various programs on a 33 MHz Compaq 386.
- All programs have been run on Unix SVR4, except pkzip and arj which
- only run on MSDOS. (MSDOS benchmarks are available by ftp on
- oak.oakland.edu:/pub/msdos/info/arctst*.zip.)
-
- *Please* do not post your own benchmarks made on your own files that
- nobody else can access. If you think that you must absolutely post yet
- another benchmark, make sure that your test files are available by
- anonymous ftp.
-
- The programs compared here were chosen because they are the most
- popular or because they run on Unix and source is available. For ftp
- information, see above. Three programs (hpack, comp-2 and ha) have
- been added because they achieve better compression (at the expense of
- speed) and one program (lzrw3-a) has been added because it favors
- speed at the expense of compression:
-
- - comp-2 is in wuarchive.wustl.edu:/mirrors/msdos/ddjmag/ddj9102.zip
- (inner zip file nelson.zip),
- - hpack is in wuarchive.wustl.edu:/mirrors/misc/unix/hpack75a.tar-z
- and garbo.uwasa.fi:/unix/arcers/hpack78src.tar.Z
- - ha 0.98 is in garbo.uwasa.fi:/pc/arcers/ha098.zip
- - ftp.adelaide.edu.au:/pub/compression/lzrw3-a.c [129.127.40.3]
-
- The 14 files used in the comparison are from the standard Calgary
- Text Compression Corpus, available by ftp on ftp.cpsc.ucalgary.ca
- [136.159.7.18] in /pub/text.compression.corpus/text.compression.corpus.tar.Z.
-
- The whole corpus includes 18 files, but the 4 files paper[3-6] are
- generally omitted in benchmarks. It contains several kinds of files
- (ASCII, binary, image, etc.) but has a bias towards large files.
- You may well get different ratings on the typical mix of files that
- you use daily, so keep in mind that the comparisons given below are
- only indicative.
-
- The programs are ordered by decreasing total compressed size. For a
- fair comparison between archivers and other programs, this size is
- only the size of the compressed data, not the archive size.
-
- The programs were run on an idle machine, so the elapsed time
- is significant and can be used to compare Unix and MSDOS programs.
-
- [Note: I did not have time to run again all benchmarks with more
- recent versions of zip, freeze, arj and hpack. To be done for some
- future revision of this FAQ.]
-
- size lzrw3a compress lharc yabba pkzip freeze
- version: 4.0 1.02 1.0 1.10 2.3.5
- options: -m300000
- ------ ----- ------ ------ ------ ------ ------
- bib 111261 49040 46528 46502 40456 41354 41515
- book1 768771 416131 332056 369479 306813 350560 344793
- book2 610856 274371 250759 252540 229851 232589 230861
- geo 102400 84214 77777 70955 76695 76172 68626
- news 377109 191291 182121 166048 168287 157326 155783
- obj1 21504 12647 14048 10748 13859 10546 10453
- obj2 246814 108040 128659 90848 114323 90130 85500
- paper1 53161 24522 25077 21748 22453 20041 20021
- paper2 82199 39479 36161 35275 32733 32867 32693
- pic 513216 111000 62215 61394 65377 63805 53291
- progc 39611 17919 19143 15399 17064 14164 14143
- progl 71646 24358 27148 18760 23512 17255 17064
- progp 49379 16801 19209 12792 16617 11877 11686
- trans 93695 30292 38240 28092 31300 23135 22861
- 3,141,622 1,400,105 1,259,141 1,200,580 1,159,340 1,141,821 1,109,290
- real 0m35s 0m59s 5m03s 2m40s 5m27s
- user 0m25s 0m29s 4m29s 1m46s 4m58s
- sys 0m05s 0m10s 0m07s 0m18s 0m08s
- MSDOS: 1m39s
-
-
- zoo lha arj pkzip zip hpack comp-2 ha
- 2.10 1.0(Unix) 2.30 2.04g 1.9 0.75a 0.98
- ah 2.13(MSDOS) -jm -ex -6 a2
- ------ ------ ------ ------ ------- ------ ------ ------
- bib 40742 40740 36090 35186 34950 35619 29840 26927
- book1 339076 339074 318382 313566 312619 306876 237380 235733
- book2 228444 228442 210521 207204 206306 208486 174085 163535
- geo 68576 68574 69209 68698 68418 58976 64590 59356
- news 155086 155084 146855 144954 144395 141608 128047 123335
- obj1 10312 10310 10333 10307 10295 10572 10819 9799
- obj2 84983 84981 82052 81213 81336 80806 85465 80381
- paper1 19678 19676 18710 18519 18525 18607 16895 15675
- paper2 32098 32096 30034 29566 29674 29825 25453 23956
- pic 52223 52221 53578 52777 55051 51778 55461 51639
- progc 13943 13941 13408 13363 13238 13475 12896 11795
- progl 16916 16914 16408 16148 16175 16586 17354 15298
- progp 11509 11507 11308 11214 11182 11647 11668 10498
- trans 22580 22578 20046 19808 18879 20506 21023 17927
- 1,096,166 1,096,138 1,036,934 1,019,451 1,021,043 1,005,367 890,976 845,854
- real 4m07s 6m03s 1m49s 1h22m17s 27m05s
- user 3m47s 4m23s 1m43s 1h20m46s 19m27s
- sys 0m04s 0m08s 0m02s 0m12s 2m03s
- MSDOS: 1m49s 2m41s 1m43s 14m43s
-
- Notes:
-
- - the compressed data for 'zoo ah' is always two bytes longer than for
- lha. This is simply because both programs are derived from the same
- source (ar002, written by Haruhiko Okumura, available by ftp in
- wuarchive.wustl.edu:/mirrors/msdos/archivers/ar002.zip).
-
- - hpack 0.75a gives slightly different results on SunOS. (To be checked
- with latest version of hpack).
-
- - the MSDOS versions are all optimized with assembler code and were run
- on a RAM disk. So it is not surprising that they often go faster than
- their Unix equivalents.
-
- ------------------------------------------------------------------------------
-
- Subject: [7] Which books should I read?
-
-
- [BWC 1989] Bell, T.C., Cleary, J.G. and Witten, I.H., "Text Compression",
- Prentice-Hall 1989. ISBN: 0-13-911991-4. Price: approx. US$60
- The reference on text data compression.
-
- [Nel 1991] Mark Nelson, "The Data Compression Book"
- M&T Books, Redwood City, CA, 1991. ISBN 1-55851-216-0.
- Price $36.95 including two 5" PC-compatible disks bearing
- all the source code printed in the book.
- A practical introduction to data compression.
- The book is targeted at a person who is comfortable reading C code but
- doesn't know anything about data compression. Its stated goal is to get
- you up to the point where you are competent to program standard
- compression algorithms.
-
- [Will 1990] Williams, R. "Adaptive Data Compression", Kluwer Books, 1990.
- ISBN: 0-7923-9085-7. Price: US$75.
- Reviews the field of text data compression and then addresses the
- problem of compressing rapidly changing data streams.
-
- [Stor 1988] Storer, J.A. "Data Compression: Methods and Theory", Computer
- Science Press, Rockville, MD. ISBN: 0-88175-161-8.
- A survey of various compression techniques, mainly statistical
- non-arithmetic compression and LZSS compression. Includes complete Pascal
- code for a series of LZ78 variants.
-
- [Stor 1992] Storer, J.A. "Image and Text Compression", Kluwer Academic
- Publishers, 1992, ISBN 0-7923-9243-4
-
- [ACG 1991] Advances in Speech Coding, edited by Atal, Cuperman, and Gersho,
- Kluwer Academic Press, 1991.
-
- [GG 1991] Vector Quantization and Signal Compression, by Gersho and Gray,
- Kluwer Acad. Press, 1991, ISBN 0-7923-9181-0.
-
- [CT 1991] Elements of Information Theory, by T.M.Cover and J.A.Thomas
- John Wiley & Sons, 1991. ISBN 0-471-06259-6.
-
- Review papers:
-
- [BWC 1989] Bell, T.C., Witten, I.H., and Cleary, J.G. "Modeling for Text
- Compression", ACM Computing Surveys, Vol.21, No.4 (December 1989), p.557
- A good general overview of compression techniques (as well as modeling for
- text compression); the condensed version of "Text Compression".
-
- [Lele 1987] Lelewer, D.A, and Hirschberg, D.S. "Data Compression", ACM
- Computing Surveys, Vol.19, No.3 (September 1987), p.261.
- A survey of data compression techniques which concentrates on Huffman
- compression and makes only passing mention of other techniques.
-
-
- ------------------------------------------------------------------------------
-
- Subject: [8] What about patents on data compression algorithms?
-
-
- [Note: the appropriate group for discussing software patents is
- comp.patents (or misc.legal.computing), not comp.compression.]
-
- All patents mentioned here are US patents, and thus probably
- not applicable outside the US. See item 70, "Introduction to data
- compression" for the meaning of LZ77, LZ78 or LZW.
-
-
- (a) Run length encoding
-
- - Tsukiyama has two patents on run length encoding: 4,586,027 and 4,872,009
- granted in 1986 and 1989 respectively. The first one covers run length
- encoding in its most primitive form: a length byte followed by the
- repeated byte. The second patent covers the 'invention' of limiting the
- run length to 16 bytes and thus encoding the length in 4 bits.
- Here is the start of claim 1 of patent 4,872,009, just for pleasure:
-
- 1. A method of transforming an input data string comprising a plurality
- of data bytes, said plurality including portions of a plurality of
- consecutive data bytes identical to one another, wherein said data
- bytes may be of a plurality of types, each type representing different
- information, said method comprising the steps of: [...]
-
- - O'Brien has patented (4,988,998) run length encoding followed by LZ77.
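The primitive scheme described above (a length byte followed by the repeated byte) can be sketched in C as follows. This is a hypothetical minimal codec written for illustration; the exact byte layout is an assumption of this sketch, not the format claimed in the patents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative run-length codec: each run of identical bytes becomes a
 * count byte (1-255) followed by the byte value.  Both functions return
 * the number of bytes written to 'out'. */
size_t rle_encode(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 255)
            run++;
        out[o++] = (uint8_t)run; /* length byte */
        out[o++] = in[i];        /* repeated byte */
        i += run;
    }
    return o;
}

size_t rle_decode(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t o = 0;
    for (size_t i = 0; i + 1 < n; i += 2)     /* (count, value) pairs */
        for (uint8_t k = 0; k < in[i]; k++)
            out[o++] = in[i + 1];
    return o;
}
```

Note that this naive scheme doubles the size of data containing no runs (each isolated byte costs two bytes), which is one reason run length encoding is usually combined with other methods, as in O'Brien's patent.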
-
-
- (b) LZ77
-
- - Waterworth patented (4,701,745) the algorithm now known as LZRW1,
- because Ross Williams reinvented it later and posted it on
- comp.compression on April 22, 1991. (See item 5 for the ftp site
- with all LZRW derivatives.) The *same* algorithm has later been
- patented by Gibson & Graybill (see below). The patent office failed
- to recognize that the same algorithm was patented twice, even though
- the wording used in the two patents is very similar.
-
- The Waterworth patent is now owned by Stac Inc, which won a lawsuit
- against Microsoft concerning the compression feature of MS-DOS 6.0.
- Damages awarded were $120 million.
-
- - Fiala and Greene obtained in 1990 a patent (4,906,991) on all
- implementations of LZ77 using a tree data structure. Claim 1 of the
- patent is much broader than the algorithms published by Fiala and
- Greene in Comm.ACM, April 89. The patent covers the algorithm
- published by Rodeh and Pratt in 1981 (J. of the ACM, vol 28, no 1,
- pp 16-24). It also covers the algorithm previously patented by
- Eastman-Lempel-Ziv (4,464,650), and the algorithms used in lharc,
- lha and zoo.
-
- - Notenboom (from Microsoft) 4,955,066 uses three levels of
- compression, starting with run length encoding.
-
- - The Gibson & Graybill patent 5,049,881 covers the LZRW1 algorithm
- previously patented by Waterworth and reinvented by Ross Williams.
- Claims 4 and 12 are very general and could be interpreted as
- applying to any LZ algorithm using hashing (including all variants
- of LZ78):
-
- 4. A compression method for compressing a stream of input data into
- a compressed stream of output data based on a minimum number of
- characters in each input data string to be compressed, said
- compression method comprising the creation of a hash table, hashing
- each occurrence of a string of input data and subsequently searching
- for identical strings of input data and if such an identical string
- of input data is located whose string size is at least equal to the
- minimum compression size selected, compressing the second and all
- subsequent occurrences of such identical string of data, if a string
- of data is located which does not match to a previously compressed
- string of data, storing such data as uncompressed data, and for each
- input strings after each hash is used to find a possible previous
- match location of the string, the location of the string is stored
- in the hash table, thereby using the previously processed data to
- act as a compression dictionary.
-
- Claim 12 is identical, with 'method' replaced by 'apparatus'. Since
- the 'minimum compression size' can be as small as 2, the claim could
- cover any dictionary technique of the LZ family. However the text of the
- patent and the other claims make clear that the patent should cover the
- LZRW1 algorithm only. (In any case the Gibson & Graybill patent is likely
- to be invalid because of the prior art in the Waterworth patent.)
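-
- To make the breadth of such claims concrete, here is a toy LZ77
- compressor with a hash table, in the general spirit of the claim quoted
- above. This is our own sketch, not any patented implementation: the
- 3-byte string itself serves as a perfect 'hash' key, whereas real LZRW1
- hashes three bytes down to a small table, and a real compressor would
- also update the table at positions inside a match.

```python
MIN_MATCH = 3  # the 'minimum compression size' of the claim

def lz_hash_compress(data: bytes):
    """Greedy LZ77 variant: remember where each 3-byte string last occurred;
    on a match of at least MIN_MATCH bytes emit (offset, length), else a literal."""
    table = {}   # 3-byte string -> position of its most recent occurrence
    out = []     # tokens: ('lit', byte) or ('ref', offset, length)
    i = 0
    while i < len(data):
        key = data[i:i + MIN_MATCH]
        pos = table.get(key)
        table[key] = i                      # store current location
        if pos is not None and len(key) == MIN_MATCH:
            length = 0
            while (i + length < len(data)
                   and data[pos + length] == data[i + length]):
                length += 1
            if length >= MIN_MATCH:
                out.append(('ref', i - pos, length))
                i += length
                continue
        out.append(('lit', data[i]))
        i += 1
    return out

def lz_decompress(tokens):
    """Replay literals and back-references (handles overlapping copies)."""
    out = bytearray()
    for t in tokens:
        if t[0] == 'lit':
            out.append(t[1])
        else:
            _, offset, length = t
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)
```

- Note that the previously processed data acts as the entire dictionary,
- which is exactly why claims 4 and 12 read on so many LZ variants.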
-
- - Phil Katz, author of pkzip, also has a patent on LZ77 (5,051,745)
- but the claims only apply to sorted hash tables, and when the hash
- table is substantially smaller than the window size.
-
- - IBM patented (5,001,478) the idea of combining a history buffer (the
- LZ77 technique) and a lexicon (as in LZ78).
-
- - Stac Inc patented (5,016,009 and 5,126,739) yet another variation of LZ77
- with hashing. The '009 patent was used in the lawsuit against Microsoft
- (see above). Stac also has patents on LZ77 with parallel lookup in
- hardware (4,841,092 and 5,003,307).
-
- - Robert Jung, author of 'arj', has been granted patent 5,140,321
- for one variation of LZ77 with hashing. This patent covers the LZRW3-A
- algorithm, also previously discovered by Ross Williams. LZRW3-A was posted
- on comp.compression on July 15, 1991. The patent was filed two months later
- on Sept 4, 1991. (The US patent system allows this because of the
- 'invention date' rule.)
-
- - Chambers 5,155,484 is yet another variation of LZ77 with hashing.
- The hash function is just the juxtaposition of two input bytes;
- this is the 'invention' being patented. The hash table is named
- 'direct lookup table'.
-
-
- (c) LZ78
-
- - One form of the original LZ78 algorithm was patented (4,464,650) by
- its authors Lempel, Ziv, Cohn and Eastman.
-
- - The LZW algorithm used in 'compress' is patented by IBM (4,814,746)
- and Unisys (4,558,302). It is also used in the V.42bis compression
- standard (see question 11 on V.42bis below) and in Postscript Level 2.
- (Unisys sells the license to modem manufacturers for a one-time
- $25,000 fee.) The IBM patent application was filed three weeks
- before that of Unisys, but the US patent office failed to recognize
- that they covered the same algorithm. (The IBM patent is more
- general, but its claim 7 is exactly LZW.)
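-
- For reference, here is a minimal, unoptimized rendering of the LZW
- scheme itself (a textbook sketch of ours: real implementations such as
- 'compress' use fixed-width growing codes and bounded tables, not a
- Python list of unbounded integers):

```python
def lzw_compress(data: bytes):
    """Textbook LZW: start with the 256 one-byte strings, emit a code for
    the longest known prefix, and add one new string per code emitted."""
    table = {bytes([i]): i for i in range(256)}
    w = b""
    out = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc
        else:
            out.append(table[w])
            table[wc] = len(table)   # new string gets the next free code
            w = bytes([byte])
    if w:
        out.append(table[w])
    return out

def lzw_decompress(codes):
    """Rebuild the same table on the fly from the code stream."""
    table = {i: bytes([i]) for i in range(256)}
    prev = table[codes[0]]
    out = bytearray(prev)
    for code in codes[1:]:
        entry = table.get(code, prev + prev[:1])  # code-not-yet-known case
        out += entry
        table[len(table)] = prev + entry[:1]
        prev = entry
    return bytes(out)
```

- The decompressor never needs the table transmitted: it reconstructs it
- from the codes, one entry behind the compressor.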
-
- - AP coding is patented by Storer (4,876,541). (Get the yabba package
- for source code, see question 2 above, file type .Y)
-
-
- (d) arithmetic coding
-
- - IBM holds many patents on arithmetic coding (4,286,256 4,295,125
- 4,463,342 4,467,317 4,633,490 4,652,856 4,891,643 4,905,297 4,935,882).
- It has patented in particular the Q-coder implementation of arithmetic
- coding. The arithmetic coding option of the JPEG standard requires
- use of the patented algorithm. No JPEG-compatible method is
- possible without infringing the patent, because what IBM actually
- claims rights to is the underlying probability model (the heart of
- an arithmetic coder). (See the JPEG FAQ for details.)
-
- AT&T has 3 patents on arithmetic coding (4,973,961, 5,023,611, 5,025,258).
-
-
- As can be seen from the above list, some of the most popular compression
- programs (compress, pkzip, zoo, lha, arj) are now covered by patents.
- (This says nothing about the validity of these patents.)
-
- Here are some references on data compression patents. A number of them are
- taken from the list prep.ai.mit.edu:/pub/lpf/patent-list.
-
- 3,914,586
- Data compression method and apparatus
- filed 10/25/73, granted 10/21/75
- General Motors Corporation, Detroit MI
- Duane E. McIntosh, Santa Ynez CA
- Data compression apparatus is disclosed which is operable in either a bit
- pair coding mode or a word coding mode depending on the degree of
- redundancy of the data to be encoded.
-
- 3,976,844
- Data communication system for transmitting data in compressed form
- filed Apr. 4, 1975, granted Aug. 24, 1976
- inventor Bernard K. Betz, assignee Honeywell Information Systems, Inc.
- [encode differences with previous line]
-
- 4,021,782
- Data compaction system and apparatus
- inventor Hoerning
- filed 04/30/1975, granted 05/03/1977
- [A primitive form of LZ77 with implicit offsets (compare with previous record)]
-
- 4,054,951
- Data expansion apparatus
- inventor R.D. Jackson, assignee IBM
- filed Jun. 30, 1976, granted Oct. 18, 1977
- [Covers only decompression of data compressed with a variant of LZ77.]
-
- 4,087,788
- Data compression system
- filed 1/14/77, granted 5/2/78
- NCR Canada LTD - NCR Canada Ltee, Mississauga CA
- Brian J. Johannesson, Waterloo CA
- A data compression system is disclosed in which the left hand boundary
- of a character is developed in the form of a sequence of Freeman
- direction codes, the codes being stored in digital form within a
- processor.
-
- 4,286,256
- Method and means for arithmetic coding using a reduced number of operations.
- granted Aug 25, 1981
- assignee IBM
-
- 4,295,125
- A method and means for pipeline decoding of the high to low order pairwise
- combined digits of a decodable set of relatively shifted finite number of
- strings
- granted Oct 13, 1981
- assignee IBM
-
- 4,412,306
- System for minimizing space requirements for storage and transmission of
- digital signals
- filed May 14, 1981, granted Oct. 25, 1983
- inventor Edward W. Moll
-
- 4,463,342
- A method and means for carry-over control in a high order to low order
- combining of digits of a decodable set of relatively shifted finite number
- strings.
- granted Jul 31, 1984
- assignee IBM
-
- 4,491,934
- Data compression process
- filed May 12, 1982, granted Jan. 1, 1985
- inventor Karl E. Heinz
-
- 4,464,650
- Apparatus and method for compressing data signals and restoring the
- compressed data signals
- inventors Lempel, Ziv, Cohn, Eastman
- assignees Sperry Corporation and At&T Bell Laboratories
- filed 8/10/81, granted 8/7/84
- A compressor parses the input data stream into segments where each
- segment comprises a prefix and the next symbol in the data stream
- following the prefix.
-
- 4,467,317
- High-speed arithmetic compression using concurrent value updating.
- granted Aug 21, 1984
- assignee IBM
-
- 4,494,108
- Adaptive source modeling for data file compression within bounded memory
- filed Jun. 5, 1984, granted Jan. 15, 1985
- inventors Glen G. Langdon, Jorma J. Rissanen
- assignee IBM
- order 1 Markov modeling
-
- 4,558,302
- High speed data compression and decompression apparatus and method
- inventor Welch
- assignee Sperry Corporation (now Unisys)
- filed 6/20/83, granted 12/10/85
- The text for this patent can be ftped from rusmv1.rus.uni-stuttgart.de
- (129.69.1.12) in /info/comp.patents/US4558302.Z.
-
- 4,560,976
- Data compression
- filed 6/5/84, granted 12/24/85
- Codex Corporation, Mansfield MA
- Steven G. Finn, Framingham, MA
- A stream of source characters, which occur with varying relative
- frequencies, is encoded into a compressed stream of codewords, each
- having one, two or three subwords, by ranking the source characters by
- their current frequency of appearance, encoding the source characters
- having ranks no higher than a first number as one subword codewords,
- source characters having ranks higher than the first number but no
- higher than a second number as two subword codewords, and the
- remaining source characters as three subword codewords.
-
- 4,586,027
- Method and system for data compression and restoration
- inventor Tsukiyama et al.
- assignee Hitachi
- filed 08/07/84, granted 04/29/86
- patents run length encoding
-
- 4,597,057
- System for compressed storage of 8-bit ASCII bytes using coded strings
- of 4-bit nibbles.
- inventor Snow, assignee System Development Corporation.
- filed 12/31/1981, granted 06/24/1986.
- Compression using static dictionary of common words, prefixes and suffixes.
-
- 4,612,532
- Data compression apparatus and method
- inventor Bacon, assignee Telebyte Corporation
- filed Jun. 19, 1984, granted Sep. 16, 1986
- [Uses followsets as in the pkzip 0.92 'reduce' algorithm, but the
- followsets are dynamically updated. This is in effect a sort of order-1
- Markov modeling.]
-
- 4,622,545
- Method and apparatus for image compression and manipulation
- inventor William D. Atkinson
- assignee Apple Computer Inc.
- filed 9/30/82
- granted 11/11/86
-
- 4,633,490
- Symmetrical adaptive data compression/decompression system.
- granted Dec 30, 1986
- assignee IBM
-
- 4,652,856
- A multiplication-free multi-alphabet arithmetic code.
- granted Feb 4, 1986
- assignee IBM
-
- 4,667,649
- Data receiving apparatus
- filed 4/18/84, granted 6/30/87
- inventors Kunishi et al.
- assignee Canon Kabushiki Kaisha, Tokyo Japan
- compression of Fax images.
-
- 4,682,150
- Data compression method and apparatus
- inventors Mathes and Protheroe,
- assignee NCR Corporation, Dayton OH
- A system and apparatus for compressing redundant and nonredundant
- binary data generated as part of an operation of a time and attendance
- terminal in which the data represents the time an employee is present
- during working hours.
-
- 4,701,745
- Data compression system
- inventor Waterworth John R
- assignee Ferranti PLC GB, patent rights now acquired by Stac Inc.
- filed 03/03/1986 (03/06/1985 in GB), granted 10/20/1987
- Algorithm now known as LZRW1 (see above)
- I claim:
- 1. A data compression system comprising an input store for receiving
- and storing a plurality of bytes of uncompressed data from an outside
- source, and data processing means for processing successive bytes of
- data from the input store;
- the data processing means including circuit means operable to check
- whether a sequence of successive bytes to be processed is identical with
- a sequence of bytes already processed, and including hash generating
- means responsive to the application of a predetermined number of
- bytes in sequence to derive a hash code appropriate to those bytes, a
- temporary store in which the hash code may represent the address of a
- storage location, and a pointer counter operable to store in the
- temporary store at said address a pointer indicative of the position
- in the input store of one of the predetermined number of bytes;
- output means operable to apply to a transfer medium each byte of data
- not forming part of such an identical sequence; and
- encoding means responsive to the identification of such a sequence to
- apply to the transfer medium an identification signal which identifies
- both the location in the input store of the previous occurrence of the
- sequence of bytes and the number of bytes contained in the sequence.
-
- 4,730,348
- Adaptive data compression system
- inventor MacCrisken, assignee Adaptive Computer Technologies
- filed Sep. 19, 1986, granted Mar. 8, 1988
- [order-1 Markov modeling + Huffman coding + LZ77]
-
- 4,758,899
- Data compression control device
- inventor Tsukiyama, assignee Hitachi
- filed 11/20/1985, granted 07/19/1988
- Limits compression to ensure that tape drive stays busy.
-
- 4,809,350
- Data compression system
- filed Jan. 30, 1987, granted Feb. 28, 1989
- inventor Yair Shimoni & Ron Niv
- assignee Elscint Ltd., Haifa, Israel
- [Image compression via variable length encoding of differences with
- predicted data.]
-
- 4,814,746
- Data compression method
- inventors Victor S. Miller, Mark N. Wegman
- assignee IBM
- filed 8/11/86, granted 3/21/89
- A previous application was filed on 6/1/83, three weeks before the
- application by Welch (4,558,302)
- Communications between a Host Computing System and a number of remote
- terminals is enhanced by a data compression method which modifies the
- data compression method of Lempel and Ziv by addition of new character
- and new string extensions to improve the compression ratio, and
- deletion of a least recently used routine to limit the encoding tables
- to a fixed size to significantly improve data transmission efficiency.
-
- 4,841,092
- continued in 5,003,307
-
- 4,853,696
- Code converter for data compression/decompression
- filed 4/13/87, granted 8/1/89
- inventor Amar Mukherjee, Maitland FL
- assignee University of Central Florida, Orlando FL
- Another hardware Huffman encoder:
- A code converter has a network of logic circuits connected in reverse
- binary tree fashion with logic paths between leaf nodes and a common
- root node.
-
- 4,872,009
- Method and apparatus for data compression and restoration
- inventor Tsukiyama et al.
- assignee Hitachi
- filed 12/07/87, granted 10/03/89
- This patent on run length encoding covers the 'invention' of limiting
- the run length to 16 bytes and thus the encoding of the length on 4 bits.
-
- 4,876,541
- Stem [sic] for dynamically compressing and decompressing electronic data
- filed 10/15/87, granted 10/24/89
- inventor James A. Storer
- assignee Data Compression Corporation
- A data compression system for encoding and decoding textual data,
- including an encoder for encoding the data and for a decoder for
- decoding the encoded data.
-
- 4,891,643
- Arithmetic coding data compression/de-compression by selectively
- employed, diverse arithmetic encoders and decoders.
- granted Jan 2, 1990
- assignee IBM
-
- 4,905,297
- granted Feb 27, 1990
- assignee IBM
- Arithmetic coding encoder and decoder system.
-
- 4,906,991
- Textual substitution data compression with finite length search window
- filed 4/29/1988, granted 3/6/1990
- inventors Fiala,E.R., and Greene,D.H.
- assignee Xerox Corporation
-
- 4,935,882
- Probability adaptation for arithmetic coders.
- granted Jun 19, 1990
- assignee IBM
-
- 4,941,193
- Barnsley, fractal compression.
-
- 4,943,869
- Compression Method for Dot Image Data
- filed 1988-05-04, granted 1990-07-24
- assignee Fuji Photo Film Co.
- Lossy and lossless image compression schemes.
-
- 4,955,066
- Compressing and Decompressing Text Files
- filed 10/13/89, granted 09/04/90
- inventor Notenboom, L.A.
- assignee Microsoft
- Now extended as 5,109,433
- [Noted in signon screen of Word 5.5 and on the outside of the MS-DOS 5.0
- Upgrade.]
- A method of compressing a text file in digital form is disclosed.
- A full text file having characters formed into phrases is provided by an
- author. The characters are digitally represented by bytes. A first pass
- compression is sequentially followed by a second pass compression of the
- text which has previously been compressed. A third or fourth level of
- compression is serially performed on the compressed text. For example, in
- a first pass, the text is run-length compressed. In a second pass, the
- compressed text is further compressed with key phrase compression. In a
- third pass, the compressed text is further compressed with Huffman
- compression. The compressed text is stored in a text file having a Huffman
- decode tree, a key phrase table, and a topic index. The data is
- decompressed in a single pass and provided one line at a time as an output.
- Sequential compressing of the text minimizes the storage space required for
- the file. Decompressing of the text is performed in a single pass. As a
- complete line is decompressed, it is output rapidly, providing full text to
- the user.
-
- 4,973,961
- Method and apparatus for carry-over control in arithmetic coding.
- granted Nov 27, 1990
- assignee AT&T
-
- 4,988,998
- Data compression system for successively applying at least two data
- compression methods to an input data stream.
- inventor O'Brien
- assignee Storage Technology Corporation, Louisville, Colorado
- filed Sep 5, 1989, granted Jan 29, 1991.
- Run length encoding followed by LZ77.
-
- 5,001,478
- Method of Encoding Compressed Data
- filed 12/28/89, granted 03/19/91
- inventor Michael E. Nagy
- assignee IBM
- 1. A method of encoding a compressed data stream made up of a sequence of
- literal references, lexicon references and history references, which
- comprises the steps of:
- assigning to each literal reference a literal identifier;
- assigning to each history reference a history identifier;
- assigning to each lexicon reference a lexicon identifier;
- and emitting a data stream with said identifiers assigned to said references.
- Gordon Irlam <gordoni@cs.adelaide.edu.au> says:
- The invention can probably be best understood by considering the
- decompressor. It consists of a history buffer, and a lexicon buffer, both
- of which are initially empty. The history buffer contains the last n
- symbols emitted. Whenever a history buffer reference is to be output the
- string so referenced is subsequently moved to the lexicon buffer for future
- reference. Thus the history buffer keeps track of strings that may be
- repeated on a very short term basis, while the lexicon buffer stores items
- for a longer time. Furthermore a history reference involves specifying
- both the offset and length within the history buffer, whereas a lexicon
- reference simply specifies a number denoting the string. Both buffers have
- a finite size.
-
- 5,003,307
- Data compression apparatus with shift register search means
- filed Oct. 6, 1989, granted Mar. 26, 1991
- inventors George Glen A, Ivey Glen E, Whiting Douglas L
- assignee Stac Inc
- continuation of 4,841,092
-
- 5,016,009
- Data compression apparatus and method
- filed 01/13/1989, granted 05/14/1991
- inventors George Glen A, Ivey Glen E, Whiting Douglas L
- assignee Stac Inc
- LZ77 with offset hash table (extended in 5,126,739)
-
- 5,023,611
- Entropy encoder/decoder including a context extractor.
- granted Jun 11, 1991
- assignee AT&T
-
- 5,025,258
- Adaptive probability estimator for entropy encoder/decoder.
- granted Jun 18, 1991
- assignee AT&T
-
- 5,049,881
- Apparatus and method for very high data rate-compression incorporating
- lossless data compression and expansion utilizing a hashing technique
- inventors Dean K. Gibson, Mark D. Graybill
- assignee Intersecting Concepts, Inc.
- filed 6/18/90, granted 9/17/91
- [covers lzrw1, almost identical with Waterworth 4,701,745]
-
- 5,051,745
- String searcher, and compressor using same
- filed 8/21/90, granted 9/24/91
- inventor Phillip W. Katz (author of pkzip)
- In the string search method and apparatus pointers to the string to be
- searched are indexed via a hashing function and organized according to the
- hashing values of the string elements pointed to. The hashing function is
- also run on the string desired to be found, and the resulting hashing value
- is used to access the index. If the resulting hashing value is not in the
- index, it is known that the target string does not appear in the string
- being searched. Otherwise the index is used to determine the pointers which
- correspond to the target hashing value, these pointers pointing to likely
- candidates for matching the target string. The pointers are then used to
- sequentially compare each of the locations in the string being searched to
- the target string, to determine whether each location contains a match to
- the target string.
- In the method and apparatus for compressing a stream of data symbols, a
- fixed length search window, comprising a predetermined contiguous portion
- of the symbol stream, is selected as the string to be searched by the
- string searcher. If a string to be compressed is found in the symbol
- stream, a code is output designating the location within the search window
- of the matching string and the length of the matching string.
-
- 5,065,447
- Barnsley, fractal compression
-
- 5,109,433
- Compressing and decompressing text files
- inventor Notenboom
- assignee Microsoft
- extension of 4,955,066
-
- 5,126,739
- Data Compression Apparatus and Method
- filed Nov. 27, 1990, granted June 30, 1992.
- inventors Whiting et al.
- assignee Stac Inc
- LZ77 with offset hash table (extension of 5,016,009)
-
- 5,140,321
- Data compression/decompression method and apparatus
- filed 9/4/91, granted 8/18/92
- inventor Robert Jung
- assignee Prime Computer
-
- 5,155,484
- Fast data compressor with direct lookup table indexing into history buffer
- filed 9/13/1991, granted 10/13/1992
- inventor Chambers, IV, Lloyd L., Menlo Park, California
- assignee Salient Software, Inc., Palo Alto, California (02)
- Uses a 64K hash table indexed by the first two characters of
- the input string. Includes several claims on the LZ77 file format
- (literal or pair offset,length).
-
- 5,179,378
- filed Jul. 30, 1991, granted Jan. 12, 1993
- inventor Ranganathan
- assignee University of South Florida
- Method and apparatus for the compression and decompression of data
- using Lempel-Ziv based techniques.
- [This covers LZ77 hardware compression with a systolic array of
- processors working in parallel.]
-
- Japan 2-46275
- Coding system
- granted Feb 26, 1990
- [Patents one form of arithmetic coding.]
-
- ------------------------------------------------------------------------------
-
- Subject: [9] The WEB 16:1 compressor.
-
-
- [WARNING: this topic has generated the greatest volume of news in the
- history of comp.compression. Read this before posting on this subject.]
-
- [WARNING 2: it is quite possible that the story is repeating itself
- with a compressor called MINC by Premier Research Corporation, Ltd.
- They claim a breakthrough in lossless data compression using a recursive
- method, that is, applying the compressor to the compressed output of
- the previous run.]
-
- [WARNING 3: the OWS program, which also claims incredible compression
- ratios, is a hoax. It just remembers the clusters which contained
- the data. The data can be recovered only if those clusters are not
- used again for another file. Needless to say, never trust such a
- lossy program.]
-
- (a) What the press says
-
- April 20, 1992 Byte Week Vol. 4, No. 25:
-
- "In an announcement that has generated high interest - and more than a
- bit of skepticism - WEB Technologies (Smyrna, GA) says it has
- developed a utility that will compress files of greater than 64KB in
- size to about 1/16th their original length. Furthermore, WEB says its
- DataFiles/16 program can shrink files it has already compressed."
- [...]
- "A week after our preliminary test, WEB showed us the program successfully
- compressing a file without losing any data. But we have not been able
- to test this latest beta release ourselves."
- [...]
- "WEB, in fact, says that virtually any amount of data can be squeezed
- to under 1024 bytes by using DataFiles/16 to compress its own output
- multiple times."
-
- June 1992 Byte, Vol 17 No 6:
-
- [...] According to Earl Bradley, WEB Technologies' vice president of
- sales and marketing, the compression algorithm used by DataFiles/16
- is not subject to the laws of information theory. [...]
-
-
- (b) First details, by John Wallace <buckeye@spf.trw.com>:
-
- I called WEB at (404)514-8000 and they sent me some product
- literature as well as chatting for a few minutes with me on the phone.
- Their product is called DataFiles/16, and their claims for it are
- roughly those heard on the net.
-
- According to their flier:
-
- "DataFiles/16 will compress all types of binary files to approximately
- one-sixteenth of their original size ... regardless of the type of
- file (word processing document, spreadsheet file, image file,
- executable file, etc.), NO DATA WILL BE LOST by DataFiles/16."
- (Their capitalizations; 16:1 compression only promised for files >64K
- bytes in length.)
-
- "Performed on a 386/25 machine, the program can complete a
- compression/decompression cycle on one megabyte of data in less than
- thirty seconds"
-
- "The compressed output file created by DataFiles/16 can be used as the
- input file to subsequent executions of the program. This feature of
- the utility is known as recursive or iterative compression, and will
- enable you to compress your data files to a tiny fraction of the
- original size. In fact, virtually any amount of computer data can
- be compressed to under 1024 bytes using DataFiles/16 to compress its
- own output files multiple times. Then, by repeating in reverse the
- steps taken to perform the recursive compression, all original data
- can be decompressed to its original form without the loss of a single
- bit."
-
- Their flier also claims:
-
- "Constant levels of compression across ALL TYPES of FILES"
- "Convenient, single floppy DATA TRANSPORTATION"
-
- From my telephone conversation, I was assured that this is an
- actual compression program. Decompression is done by using only the
- data in the compressed file; there are no hidden or extra files.
-
-
- (c) More information, by Rafael Ramirez <rafael.ramirez@channel1.com>:
-
- Today (Tuesday, 28th) I got a call from Earl Bradley of Web
- who now says that they have put off releasing a software version of
- the algorithm because they are close to signing a major contract with
- a big company to put the algorithm in silicon. He said he could not
- name the company due to non-disclosure agreements, but that they had
- run extensive independent tests of their own and verified that the
- algorithm works. [...]
-
- He said the algorithm is so simple that he doesn't want anybody
- getting their hands on it and copying it even though he said they
- have filed a patent on it. [...] Mr. Bradley said the silicon version
- would hold up much better to patent enforcement and be harder to copy.
-
- He claimed that the algorithm takes up about 4K of code, uses only
- integer math, and the current software implementation only uses a 65K
- buffer. He said the silicon version would likely use a parallel
- version and work in real-time. [...]
-
-
- (d) The impossibility proofs.
-
- It is impossible for a given program to compress without loss *all*
- files greater than a certain size by at least one bit. This can be
- proven by a simple counting argument. (Many other proofs have been
- posted on comp.compression, *please* do not post yet another one.)
-
- Assume that the program can compress without loss all files of size >= N
- bits. Compress with this program all the 2^N files which have
- exactly N bits. All compressed files have at most N-1 bits, so there
- are at most (2^N)-1 different compressed files [2^(N-1) files of size
- N-1, 2^(N-2) of size N-2, and so on, down to 1 file of size 0]. So at
- least two different input files must compress to the same output file.
- Hence the compression program cannot be lossless. (Stronger results
- about the number of incompressible files can be obtained, but the
- proofs are a little more complex.)
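-
- The count in the argument can be checked mechanically for a tiny N
- (a toy verification of the pigeonhole count, not a new proof):

```python
from itertools import product

N = 8  # tiny stand-in for WEB's 64K*8 bits; the argument is identical

# all 2^N possible input files of exactly N bits
inputs = [''.join(bits) for bits in product('01', repeat=N)]

# every possible output strictly shorter than N bits
shorter = [''.join(bits) for k in range(N) for bits in product('01', repeat=k)]

assert len(inputs) == 2 ** N
assert len(shorter) == 2 ** N - 1   # 1 + 2 + ... + 2^(N-1)

# 2^N pigeons into 2^N - 1 holes: any program mapping each N-bit file to
# a strictly shorter file must send two inputs to the same output, so no
# decompressor can recover both.
assert len(inputs) > len(shorter)
```

- No property of the compressor is used anywhere above, which is why the
- argument covers recursive schemes like DataFiles/16 as well.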
-
- This argument applies of course to WEB's case (take N = 64K*8 bits).
- Note that no assumption is made about the compression algorithm.
- The proof applies to *any* algorithm, including those using an
- external dictionary, or repeated application of another algorithm,
- or combination of different algorithms, or representation of the
- data as formulas, etc... All schemes are subject to the counting argument.
- There is no need to use information theory to provide a proof, just
- basic mathematics.
-
- This assumes of course that the information available to the decompressor
- is only the bit sequence of the compressed data. If external information
- such as a file name or a number of iterations is necessary to decompress
- the data, the bits providing the extra information must be included in
- the bit count of the compressed data. (Otherwise, it would be sufficient
- to consider any input data as a number, use this as the iteration
- count or file name, and pretend that the compressed size is zero.)
- For an example of storing information in the file name, see the program
- lmfjyh in the 1993 International Obfuscated C Code Contest, available
- on all comp.sources.misc archives (Volume 39, Issue 104).
-
- [See also question 73 "What is the theoretical compression limit?" in
- part 2 of this FAQ.]
-
-
- (e) No software version
-
- Appeared on BIX, reposted by Bruce Hoult <Bruce.Hoult@actrix.gen.nz>:
-
- tojerry/chaos #673, from abailey, 562 chars, Tue Jun 16 20:40:34 1992
- Comment(s).
- ----------
- TITLE: WEB Technology
- I promised everyone a report when I finally got the poop on WEB's
- 16:1 data compression. After talking back and forth for a year
- and being put off for the past month by un-returned phone calls,
- I finally got hold of Marc Spindler who is their sales manager.
-
- _No_ software product is forthcoming, period!
-
- He began talking about hardware they are designing for delivery
- at the end of the year. [...]
-
-
- (f) Product cancelled
-
- Posted by John Toebes <toebes@bnr.ca> on Aug 10th, 1992:
-
- [Long story omitted, confirming the reports made above about the
- original WEB claims.]
-
- 10JUL92 - Called to Check Status. Was told that testing had uncovered a
- new problem where 'four numbers in a matrix were the same
- value' and that the programmers were off attempting to code a
- preprocessor to eliminate this rare case. I indicated that he
- had told me this story before. He told me that the
- programmers were still working on the problem.
-
- 31JUL92 - Final Call to Check Status. Called Earl in the morning and
- was told that he still had not heard from the programmers. [...]
- Stated that if they could not resolve the problem then there would
- probably not be a product.
-
- 03AUG92 - Final Call. Earl claims that the programmers are unable to
- resolve the problem. I asked if this meant that there would
- not be a product as a result and he said yes.
-
-
- (g) Conclusion
-
- The last report given above should put an end to the WEB story.
-
- [Note from the FAQ maintainer: I intended to remove this story from
- the FAQ, but the recent announcement of the MINC compressor has some
- similarities with the WEB story so it is useful to keep it a little
- longer.]
-
- ------------------------------------------------------------------------------
-
- Subject: [11] What is the V.42bis standard?
-
-
- A description of the V.42bis standard is given in "The V.42bis
- standard for data-compressing modems," by Clark Thomborson
- <cthombor@theory.lcs.mit.edu>, IEEE Micro, Oct 1992, pp. 41-53.
-
- Short introduction, by Alejo Hausner <hausner@qucis.queensu.ca>:
-
- The V.42bis Compression Standard was proposed by the International
- Consultative Committee on Telephony and Telegraphy (CCITT) as an
- addition to the v.42 error-correction protocol for modems. Its purpose
- is to increase data throughput; it uses a variant of the
- Lempel-Ziv-Welch (LZW) compression method. It is meant to be
- implemented in the modem hardware, but can also be built into the
- software that interfaces to an ordinary non-compressing modem.
-
- V.42bis can send data compressed or not, depending on the
- data. There are some types of data that cannot be
- compressed. For example, if a file was compressed first,
- and then sent through a V.42bis modem, the modem would not
- likely reduce the number of bits sent. Indeed it is likely
- that the amount of data would increase somewhat.
-
- To avoid this problem, the algorithm constantly monitors the
- compressibility of the data, and if it finds that fewer bits
- would be needed to send the data uncompressed, it switches to
- transparent mode. The sender informs the receiver of this
- transition through a reserved escape code. Henceforth the
- data is passed as plain bytes.
-
- The choice of escape code is clever. Initially, it is a
- zero byte. Any occurrence of the escape code is replaced,
- as is customary, by two escape codes. In order to prevent a
- string of escape codes from temporarily cutting throughput
- in half, the escape code is redefined by adding 51 mod 256
- each time it is used.
-
- While transmitting in transparent mode, the sender maintains
- the LZW trees of strings, and expects the receiver to do
- likewise. If it finds an advantage in returning to
- compressed mode, it will do so, first informing the receiver
- by a special control code. Thus the method allows the
- hardware to adapt to the compressibility of the data.
-
-
- The CCITT standards documents used to be available by ftp on
- ftp.uu.net in /doc/standards/ccitt, but this service has been
- discontinued. If you ftp to digital.resource.org, in directory pub/standards
- there is a file that says that making the standards available in the
- first place was just an experiment.
-
- The documents are now on src.doc.ic.ac.uk, but the directory name
- keeps changing. Check one of:
- /computing/ccitt/ccitt-standards/ccitt/
- /computing/ccitt/standards/ccitt
- /doc/ccitt-standards/ccitt
- in this order. The v42bis standard is in *standards/ccitt/1992/v/v42bis.asc.Z.
-
-
- A mail server for CCITT documents is available at teledoc@itu.arcom.ch
- or itudoc@itu.ch. A Gopher server is also available:
-
- Name=International Telecommunication Union (ITU)
- Host=info.itu.ch
- Port=70
-
- For more information, contact Robert Shaw <shaw@itu.arcom.ch> or
- Antoinette Bautista <bautista@itu.arcom.ch>. Warning by John Levine
- <johnl@iecc.cambridge.ma.us> (probably obsolete by now):
-
- This teledoc thing is much less than meets the eye. What it
- actually has is one-page abstracts of some but not all CCITT
- recommendations, along with junk like lists of the national
- representatives to CCITT. If you want the actual text of a
- recommendation, you have to send large amounts of money to
- Switzerland, same as ever. However, a closer reading of the Teledoc
- announcement shows that they say they're planning to make the actual
- text of some CCITT recommendations available on-line sometime in 1993.
-
-
- See also the Standards FAQ posted to news.answers or get it by ftp in
- rtfm.mit.edu:/pub/usenet/news.answers/standards-faq.
-
- ------------------------------------------------------------------------------
-
- Subject: [12] I need source for the winners of the Dr Dobbs compression contest
-
-
- The sources of the top 6 programs of the Feb 91 Dr Dobbs data compression
- contest are available by ftp on
- oak.oakland.edu:/pub/msdos/compress/ddjcompr.zip
- garbo.uwasa.fi:/pc/source/ddjcompr.zip [128.214.87.1]
-
- The sources are in MSDOS end-of-line format, one directory per
- program. Unix or VMS users, use "unzip -a ddjcompr" to get correct
- end-of-lines (add -d to recreate the directory structure if you are
- using an obsolete version of unzip such as 4.1). Three of the 6
- programs are not portable and run only on MSDOS; compact and urban
- work on Unix, and sixpack requires only minor modifications.
-
- ------------------------------------------------------------------------------
-
- Subject: [13] I need source for arithmetic coding
-
-
- (See question 70 for an introduction to arithmetic coding.)
-
- The source for the arithmetic coder described in Chap.5 of Bell,
- Cleary, and Witten's book "Text Compression" (see question 7 above)
- (or, equivalently, in: Witten, Neal, and Cleary's article "Arithmetic
- Coding for Data Compression" from Communications of the Association
- for Computing Machinery, 30 (6), pp.520-540, June, 1987) is available
- via anonymous ftp from ftp.cpsc.ucalgary.ca (136.159.7.18) in directory
- /pub/arithmetic.coding. It only comes with a simple order-0 model but
- it's set up so that adding your own more sophisticated one is
- straightforward.
-
- A low precision arithmetic coding implementation avoiding hardware
- division is available on the same site (ftp.cpsc.ucalgary.ca)
- in /pub/arithmetic.coding/low.precision.version/low.precision.version.shar.
-
- Kris Popat <popat@image.mit.edu> has worked on "Scalar Quantization
- with Arithmetic Coding." It describes an arithmetic coding technique
- which is quite general and computationally inexpensive. The
- documentation and example C code are available via anonymous ftp from
- media-lab.media.mit.edu (18.85.0.2), in /pub/k-arith-code.
-
- The program 'urban' in ddjcompr.zip (see item 12 above) is a high order
- arithmetic coder working at the bit level. It is written by Urban Koistinen
- <md85-epi@nada.kth.se>.
-
- ------------------------------------------------------------------------------
-
- Subject: [15] Where can I get image compression programs?
-
-
- JPEG:
- Source code for most any machine:
- ftp.uu.net:/graphics/jpeg/jpegsrc.v4.tar.Z [137.39.1.9]
- nic.funet.fi:/pub/graphics/packages/jpeg/jpegsrc.v4.tar.Z [128.214.6.100]
- Contact: jpeg-info@uunet.uu.net (Independent JPEG Group)
-
- havefun.stanford.edu:pub/jpeg/JPEGv1.2.tar.Z (supports lossless mode)
- Contact: Andy Hung <achung@cs.stanford.edu> (see item 20 below)
-
- xv, an image viewer which can read JPEG pictures, is available in
- export.lcs.mit.edu: contrib/xv-2.21.tar.Z [18.24.0.12]
-
- MPEG:
- havefun.stanford.edu:/pub/mpeg/MPEGv1.2.alpha.tar.Z
- Contact: Andy Hung <achung@cs.stanford.edu> (see item 20 below)
-
- toe.cs.berkeley.edu:/pub/multimedia/mpeg/mpeg_play-2.0.tar.Z
- toe.cs.berkeley.edu:/pub/multimedia/mpeg/mpeg_encode-1.0.tar.Z.
- Contact: mpeg-bugs@cs.berkeley.edu
-
- toe.cs.berkeley.edu:/pub/multimedia/mpeg/vmpeg10.zip
- decel.ecel.uwa.edu.au:/users/michael/mpegw32e.zip (for Windows and NT)
-
- nvr.com:/pub/NVR-software/Product-1.0.4.tar.Z (192.82.231.50)
- (free demo copy of NVR's software toolkit for SPARCstations)
- Contact: Todd Brunhoff <toddb@nvr.com>
-
- H.261(P*64):
- havefun.stanford.edu:pub/p64/P64v1.2.alpha.tar.Z
- Contact: Andy Hung <achung@cs.stanford.edu> (see item 20 below)
-
- zenon.inria.fr:/rodeo/ivs/ivs3.3-src.tar.Z (Inria videoconference system)
- Contact: Thierry Turletti <turletti@sophia.inria.fr> (see item 20 below).
-
- JBIG:
- nic.funet.fi:/pub/graphics/misc/test-images/jbig.tar.gz.
-
- epic: (pyramid wavelet coder, see item 72)
- whitechapel.media.mit.edu:/pub/epic.tar.Z [18.85.0.125]
- Contact: Eero P. Simoncelli <eero@media.mit.edu>
- The "Lenna" test image is available as part of the EPIC package,
- where it is named "test_image".
-
- hcompress: (wavelet image compression, see item 72)
- stsci.edu:/software/hcompress/hcompress.tar.Z
-
- wavethresh: (wavelet software for the language S)
- gdr.bath.ac.uk:/pub/masgpn/wavethresh2.2.Z
- Contact: gpn@maths.bath.ac.uk
-
- rice-wlet: (wavelet software, see item 72)
- cml.rice.edu:/pub/dsp/software/rice-wlet-tools.tar.Z
-
- compfits:
- uwila.cfht.hawaii.edu:/pub/compfits/compfits.tar.Z [128.171.80.50]
- Contact: Jim Wright <jwright@cfht.hawaii.edu>
-
- fitspress:
- cfata4.harvard.edu:/pub/fitspress08.tar.Z [128.103.40.79]
-
- tiff:
- For source and sample images, see question 18 below.
-
- DCT algorithms:
- etro.vub.ac.be:/pub/DCT_ALGORITHMS/*
- Contact: Charilos Christopoulos <chchrist@etro2.vub.ac.be>
-
-
- For fractal compression programs, see item 17 below.
- For image compression hardware, see item 85 in part 3 of this FAQ.
-
- ------------------------------------------------------------------------------
-
- Subject: [16] What is the state of the art in lossless image compression?
-
-
- The current state-of-the-art is the JBIG algorithm. For an
- introduction to JBIG, see question 74 in part 2.
-
- JBIG works best on bi-level images (like faxes) and also works well on
- Gray-coded grey scale images up to about six bits per pixel. You
- just apply JBIG to the bit planes individually. For more bits/pixel,
- lossless JPEG sometimes provides better performance. (For JPEG, see
- question 19 below.)
-
- You can find a description of JBIG in ISO/IEC CD 11544, contained in
- document ISO/IEC JTC1/SC2/N2285. The only way to get it is to ask
- your National Standards Body for a copy. In the USA, call ANSI at
- (212) 642-4900.
-
- ------------------------------------------------------------------------------
-
- Subject: [17] What is the state of fractal compression?
-
- It is recommended to read first item 77 in part 2 of this FAQ:
- "Introduction to Fractal compression".
-
-
- from Tal Kubo <kubo@zariski.harvard.edu>:
-
- According to Barnsley's book 'Fractals Everywhere', this method is
- based on a measure of deviation between a given image and its
- approximation by an IFS code. The Collage Theorem states that there is
- a convergent process to minimize this deviation. Unfortunately,
- according to an article Barnsley wrote for BYTE a few years ago, this
- convergence was rather slow, about 100 hours on a Cray, unless assisted by
- a person.
-
- Barnsley et al are not divulging any technical information beyond the
- meager bit in 'Fractals Everywhere'. The book explains the idea of IFS
- codes at length, but is vague about the application of the Collage theorem
- to specific compression problems.
-
- There is reason to believe that Barnsley's company has
- *no algorithm* which takes a given reasonable image and achieves
- the compression ratios initially claimed for their fractal methods.
- The 1000-to-1 compression advertised was achieved only for a 'rigged'
- class of images, with human assistance. The best unaided
- performance I've heard of is good lossy compression of about 80-1.
-
- Steve Tate <srt@duke.cs.duke.edu> confirms:
-
- Compression ratios (unzoomed) seem to range from 20:1 to 60:1... The
- quality is considerably worse than wavelets or JPEG on most of the
- non-contrived images I have seen.
-
- But Yuval Fisher <fisher@inls1.ucsd.edu> disagrees:
-
- Their performance has improved dramatically beyond what they were
- talking about in BYTE a few years ago. Human assistance to the
- compression is no longer needed and the compression time is
- reasonable, although the more time and compute power you throw at the
- compression, the smaller the resulting file for the same level of
- quality.
-
- Geoffrey A Stephenson <ketlux@ketlux.demon.co.uk> adds:
-
- Iterated systems are shipping a general purpose compressor at about
- 300 Pounds in the UK that claims "640x480 24 bit colour compression of
- about 1 min at 922k -> 10k on a 486/50 software only, decomp. to 8
- bits in 3 secs, etc." At a recent multimedia conference in London they
- handed out free demo disks that show the decomp. in action. The
- package runs under both DOS and WIN (DLLs provided for use in
- applications). They also sell a board to speed up compression and
- offer versions supporting full motion video (but not apparently at all
- SVGA sizes like the static picture version). I have not yet got my
- hands on a full version to test different types of pictures, but
- friends have one and claim it looks good.
-
-
- Thomas W. Colthurst <thomasc@athena.mit.edu> clarifies the distinction
- between IFS and the Fractal Transform:
-
- It is time, once and for all, to put to death the Barnsley myth that
- IFSs are good for image compression. They are not. Various algorithms
- have been proposed for this "inverse problem" ranging from the trendy
- (genetic algorithms) to the deep (moment methods) to the ad hoc (the
- hungry algorithm) to the absurd (the so-called "graduate student
- algorithm", consisting of locking up a grad student in a tiny office
- with a SGI workstation and not letting them out until they come up
- with a good IFS for your image). They are all useless for practical
- image compression.
-
- In fact, there are even good theoretical reasons for believing that
- IFSs will never be useful for image compression. For example, even
- if you have an IFS for object A and an IFS for object B, there is no
- way to combine these IFSs to get an IFS for object A union B or
- object A intersect B.
-
- Even Barnsley himself admits, in his latest book, that he doesn't use
- IFS image compression. Instead, he uses the so-called "fractal
- transform," which is really just a variant of vector quantization
- where you use the image itself, sampled at a higher scale, as the
- VQ codebook. To be fair, the fractal transform can be analyzed using
- local IFSs, but local IFSs are immensely more complicated and general
- than normal IFSs, to the point where one feels suspect even using the
- word "IFS" to describe them.
-
- It should be emphasized that the fractal transform is a real, working
- method that performs about as well as other existing methods like VQ
- or the discrete cosine transform. The fractal transform will probably
- never beat vector quantization (VQ) for the size of the compressed
- image, but does have the advantage that you don't need to carry your
- codebook around. The latest results have it slightly winning over
- the discrete cosine transform; only time and more research will tell
- if this advantage persists. Just like VQ, the fractal transform
- takes a while to compress, but is quick at decompression (Barnsley's
- company has hardware to do this in realtime).
-
- In short, IFSs are good for just about everything fractals are (and
- more!), but are absolutely horrid for image compression.
-
-
- Programs:
-
- A fractal image compression program is available by ftp in
- lyapunov.ucsd.edu:/pub/young-fractal/unifs10.zip. (Unix users, see
- item 2 above for unzip.) Note the file size before you ftp it:
- 1.2 MB. The package contains source for compression and decompression,
- source for X-windows decompression, MSDOS executables and images.
- A newer version of the program is in yuvpak20.zip.
-
- A fractal image decompression program (note: decompression only) is
- available in /pub/inls-ucsd/fractal-2.0.tar on the same ftp site
- (lyapunov.ucsd.edu). Note the file size before you ftp it: 1.3 MB.
- This file also contains a paper by Yuval Fisher (see reference below),
- and some executables and sample images. Reading this paper is required
- to understand how the Young compression programs (see above) work.
-
- A note from Yuval Fisher <yfisher@ucsd.edu>:
-
- Ftp to legendre.ucsd.edu and look in pub/Research/Fisher. There
- is information there on my new book of contributed articles on
- fractal image compression, as well as the book's table of
- contents, some C code to encode and decode raw byte files of any
- size using a quadtree method, a manual explaining the use of the
- code, a fractal image compression bibliography (not guaranteed to
- be complete or close to it), some better executable code with
- sample encodings, and the SIGGRAPH '92 course notes on fractal
- image compression (these are based on appendix A of Chaos and
- Fractals by Peitgen et al., Springer Verlag).
-
-
- The source code for the program published in the Oct 93 issue of
- Byte is in ftp.uu.net:/published/byte/93oct/fractal.exe. This is a
- self-extracting zip file (use "unzip fractal.exe" to extract on
- non-MSDOS systems). The source code is for a TARGA video board.
-
- Another fractal compression program is available by ftp in
- vision.auc.dk:/pub/Limbo/Limbo*.tar.Z.
-
- References:
- A. Jacquin, 'Fractal image coding based on a theory of iterated
- contractive image transformations', Visual Comm. and Image
- Processing, vol SPIE-1360, 1990. (The best paper that explains
- the concept in a simple way.)
-
- A. Jacquin, "A Fractal Theory of Iterated Markov Operators with
- Applications to Digital Image Coding", PhD Thesis, Georgia Tech, 1989.
- It can be obtained from university microfilms for $35, phone 1-800-521-0600.
-
- M. Barnsley, L. Anson, "Graphics Compression Technology", SunWorld,
- October 1991, pp. 42-52.
- M.F. Barnsley, A. Jacquin, F. Malassenet, L. Reuter & A.D. Sloan,
- 'Harnessing chaos for image synthesis', Computer Graphics,
- vol 22 no 4 pp 131-140, 1988.
- M.F. Barnsley, A.E. Jacquin, 'Application of recurrent iterated
- function systems to images', Visual Comm. and Image Processing,
- vol SPIE-1001, 1988.
- A. Jacquin, 'Image Coding Based on a Fractal Theory of Iterated
- Contractive Image Transformations', IEEE Trans. on Image Processing,
- Vol 1 Issue 1, p. 18, January 1992.
- A.E. Jacquin, 'A novel fractal block-coding technique for digital
- images', Proc. ICASSP 1990.
- G.E. Oien, S. Lepsoy & T.A. Ramstad, 'An inner product space
- approach to image coding by contractive transformations',
- Proc. ICASSP 1991, pp 2773-2776.
- D.S. Mazel, Fractal Modeling of Time-Series Data, PhD Thesis,
- Georgia Tech, 1991. (One dimensional, not pictures)
- S. A. Hollatz, "Digital image compression with two-dimensional affine
- fractal interpolation functions", Department of Mathematics and
- Statistics, University of Minnesota-Duluth, Technical Report 91-2.
- (a nuts-and-bolts how-to-do-it paper on the technique)
- Stark, J., "Iterated function systems as neural networks",
- Neural Networks, Vol 4, pp 679-690, Pergamon Press, 1991.
- Monro D M and Dudbridge F, "Fractal block coding of images",
- Electronics Letters 28(11):1053-1054 (1992)
- Beaumont J M, "Image data compression using fractal techniques",
- British Telecom Technological Journal 9(4):93-108 (1991)
- Fisher Y, "Fractal image compression", Siggraph 92
- Graf S, "Barnsley's Scheme for the Fractal Encoding of Images",
- Journal Of Complexity, V8, 72-78 (1992).
- Monro D.M. 'A hybrid fractal transform', Proc ICASSP 93, pp. V: 169-72
- Monro D.M. & Dudbridge F. 'Fractal approximation of image blocks',
- Proc ICASSP 92, pp. III: 485-488
- Monro D.M., Wilson D., Nicholls J.A. 'High speed image coding with the Bath
- Fractal Transform', IEEE International Symposium on Multimedia Technologies
- Southampton, April 1993
- Jacobs, E.W., Y. Fisher and R.D. Boss. "Image Compression: A study
- of the Iterated Transform Method." _Signal Processing 29_ (1992) 25-263
- Vrscay, Edward R. "Iterated Function Systems: Theory, Applications,
- and the Inverse Problem." _Fractal Geometry and Analysis_,
- J. Belair and S. Dubuc (eds.) Kluwer Academic, 1991. 405-468.
-
- Books:
- The Fractal Transform,
- Michael F. Barnsley and Louisa F. Anson
- ISBN 0-86720-218-1, ca. 250 pp, $49.95
-
- Fractal Image Compression
- Michael F. Barnsley and Lyman P. Hurd
- ISBN 0-86720-457-5, ca. 250 pp., $49.95
- Copies can be ordered directly from the publisher by sending a message
- to kpeters@math.harvard.edu with name, address and a Mastercard or
- Visa card number with expiration date.
-
- Barnsley's company is:
-
- Iterated Systems, Inc.
- 5550A Peachtree Parkway, Suite 650
- Norcross, GA 30092
- tel: 404-840-0310 or 1-800-4FRACTL
- fax: 404-840-0806
- In UK: Phone (0734) 880261, Fax (0734) 880360
-
- ------------------------------------------------------------------------------
-
- Subject: [18] I need specs and source for TIFF and CCITT group 4 Fax
-
-
- Specs for Group 3 and 4 image coding (group 3 is very similar to group 4)
- are in CCITT (1988) volume VII fascicle VII.3. They are recommendations
- T.4 and T.6 respectively. There is also an updated spec contained in 1992
- recommendations T.1 to T.6.
-
- CCITT specs are available by anonymous ftp (see above answer on
- V.42bis). The T.4 and T.6 specs are on src.doc.ic.ac.uk in directory
- /computing/ccitt/ccitt-standards/ccitt/1988/ascii, files 7_3_01.txt.Z and
- 7_3_02.txt.Z respectively.
-
- The following paper covers T.4, T.6 and JBIG:
-
- "Review of standards for electronic imaging for facsimile systems"
- in Journal of Electronic Imaging, Vol. 1, No. 1, pp. 5-21, January 1992.
-
-
- Source code can be obtained as part of a TIFF toolkit - TIFF image
- compression techniques for binary images include CCITT T.4 and T.6:
-
- sgi.com:/graphics/tiff/v3.2.tar.Z [192.48.153.1]
- Contact: sam@sgi.com
-
- There is also a companion compressed tar file (v3.0pics.tar.Z) that
- has sample TIFF image files. A draft of TIFF 6.0 is in TIFF6.ps.Z.
- Concerning JPEG compression in TIFF 6.0, Tom Lane <tgl+@cs.cmu.edu> adds:
-
- The TIFF document won't do you much good unless you also have the official
- JPEG standard. You can buy it from ANSI or your national ISO member
- organization (DIN over there, I suppose). [See also the book by Pennebaker
- and Mitchell referenced in item 75 of this FAQ.]
-
- Worse, the TIFF 6.0 spec has serious problems in its JPEG features. It is
- probable that section 22 (JPEG) will be rewritten from scratch. If you are
- considering implementing TIFF/JPEG, please contact me at tgl+@cs.cmu.edu for
- the latest word.
-
- Software for reading and writing CCITT Group 3 and 4 images is
- also available in directory merry.cs.monash.edu.au:/pub/alanf/TIFF_FAX
- (130.194.67.101). Contact: Alan Finlay <alanf@bruce.cs.monash.edu.au>.
-
-
- See also question 54 below.
-
- ------------------------------------------------------------------------------
-
- Subject: [19] What is JPEG?
-
-
- JPEG (pronounced "jay-peg") is a standardized image compression mechanism.
- JPEG stands for Joint Photographic Experts Group, the original name of the
- committee that wrote the standard. JPEG is designed for compressing either
- full-color or gray-scale digital images of "natural", real-world scenes.
- It does not work so well on non-realistic images, such as cartoons or line
- drawings.
-
- JPEG does not handle black-and-white (1-bit-per-pixel) images, nor does it
- handle motion picture compression. Standards for compressing those types
- of images are being worked on by other committees, named JBIG and MPEG
- respectively.
-
- Regular JPEG is "lossy", meaning that the image you get out of decompression
- isn't quite identical to what you originally put in. The algorithm achieves
- much of its compression by exploiting known limitations of the human eye,
- notably the fact that small color details aren't perceived as well as small
- details of light-and-dark. Thus, JPEG is intended for compressing images that
- will be looked at by humans. If you plan to machine-analyze your images, the
- small errors introduced by JPEG may be a problem for you, even if they are
- invisible to the eye. The JPEG standard includes a separate lossless mode,
- but it is not widely used and does not give nearly as much compression as the
- lossy mode.
-
- Question 75 "Introduction to JPEG" (in part 2 of this FAQ) gives an overview
- of how JPEG works and provides references for further reading. Also see the
- JPEG FAQ article, which covers JPEG software and usage hints. The JPEG FAQ is
- posted regularly in news.answers by Tom Lane <tgl+@cs.cmu.edu>. (See question
- 53 "Where are FAQ lists archived" if this posting has expired at your site.)
-
- For JPEG software, see item 15 above.
- For JPEG hardware, see item 85 in part 3 of this FAQ.
-
- ------------------------------------------------------------------------------
-
- Subject: [20] I am looking for source of an H.261 codec and MPEG
-
-
- The H.261 spec is available on src.doc.ic.ac.uk in
- /computing/ccitt/standards/ccitt/1992/h/h261.doc.Z (or h261.rtf.Z).
-
-
- For H.261 hardware, see item 85 in part 3 of this FAQ.
-
- from Thierry TURLETTI <turletti@sophia.inria.fr>:
-
- IVS (INRIA VIDEOCONFERENCING SYSTEM)
- - X11-based videoconferencing tool for SPARC, HP, DEC and
- Silicon Graphic workstations.
-
- ivs allows users to conduct multi-host audio and video
- conferences over the Internet. ivs requires a workstation
- with a screen of 1, 4, 8 or 24 bit depth. Multi-host
- conferences require that the kernel support multicast IP
- extensions (RFC 1112).
-
- On video input, video frames are grabbed by the VideoPix,
- SunVideo or Parallax boards for SparcStations, the Raster Rops
- board for HP stations, the IndigoVideo board for SGI IRIS
- Indigo workstations, or the VIDEOTX board for DEC stations.
- No special hardware apart from the workstation's built-in
- audio hardware is required for audio conferences.
-
- Video encoding is done according to the H.261 standard.
- The video stream can be encoded in Super CIF (704x576 pixels),
- CIF (352x288 pixels) or QCIF (176x144 pixels) format. The
- default format is CIF.
-
- Sources, binaries & manuals are freely available by anonymous
- ftp from zenon.inria.fr in the rodeo/ivs directory. An INRIA
- report describing this application is also available in the
- same directory.
-
- If you ftp & use this package, please send all remarks or
- modifications made to <turletti@sophia.inria.fr>. If you want
- to be added to or removed from the ivs-users mailing list, please
- send e-mail to ivs-users-request@sophia.inria.fr.
-
-
- from Andy Hung <achung@cs.stanford.edu>:
-
- Public domain UNIX C source code to do both image and image sequence
- compression and decompression is available by anonymous ftp:
-
- MPEG-I havefun.stanford.edu:pub/mpeg/MPEGv1.2.alpha.tar.Z
- CCITT H.261(P*64) havefun.stanford.edu:pub/p64/P64v1.2.alpha.tar.Z
- JPEG havefun.stanford.edu:pub/jpeg/JPEGv1.2.beta.tar.Z
-
- These codecs operate on raw raster scanned images.
-
- A software program to display raw raster-scanned YUV images and image
- sequences on X grayscale or color monitors is available by anonymous
- ftp in havefun.stanford.edu:pub/cv/CVv1.1.tar.Z.
- If you are using the codecs above, we recommend that you ftp this file
- over as well.
-
- The source code has been compiled on DEC and SUN workstations.
- Caution: the P64 codec has not been tested for compliance (any available
- p64 video streams would be much appreciated - please let us know at
- achung@cs.stanford.edu). The other codecs have been tested with
- streams from other encoders.
-
- We also have some IPB MPEG-I video coded streams in pub/mpeg/*.mpg;
- and P64 video streams in pub/p64/*.p64 that we have generated using
- our codecs.
-
- For a more complete description see the file
- havefun.stanford.edu:pub/README.
-
- ------------------------------------------------------------------------------
-
- Subject: [25] Fast DCT (Discrete Cosine Transform) algorithms
-
-
- Many image compression methods, including the JPEG, MPEG, and H.261 standards,
- are based on the discrete cosine transform. A good overall introduction to
- DCT is the book "Discrete Cosine Transform---Algorithms, Advantages,
- Applications" by K.R. Rao and P. Yip (Academic Press, London, 1990).
- This has an extensive, though already dated, bibliography.
-
- Here are some newer references provided by Tom Lane <tgl+@cs.cmu.edu>.
- Most of these are in IEEE journals or conference proceedings, notably
- ICASSP = IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing.
- ICCAS = IEEE Intl. Conf. on Circuits and Systems.
- DCC = Data Compression Conference.
-
- Polynomial Transform Computation of the 2-D DCT, Duhamel & Guillemot,
- ICASSP '90 p. 1515.
- A Forward-Mapping Realization of the Inverse DCT, McMillan & Westover,
- DCC '92 p. 219.
- A Fast Algorithm for 2-D DCT, Cho, Yun & Lee, ICASSP '91 p. 2197.
- Fast Algorithm and Implementation of 2-D DCT, Cho & Lee, Tr. CAS v38 p. 297.
- A DCT Chip based on a new Structured and Computationally Efficient DCT
- Algorithm, Duhamel, Guillemot & Carlach, ICCAS '90 p. 77.
- Trade-offs in the Computation of Mono- and Multi-dimensional DCTs,
- Vetterli, Duhamel & Guillemot, ICASSP '89 p. 999.
- Practical Fast 1-D DCT Algorithms with 11 Multiplications,
- Loeffler, Ligtenberg & Moschytz, ICASSP '89 p. 988.
- New Scaled DCT Algorithms for Fused Multiply/Add Architectures,
- Linzer & Feig, ICASSP '91 p. 2201.
- Fast Algorithms for the 2-D Discrete Cosine Transform, Kamangar & Rao,
- IEEE Tr. Computers, v C-31 p. 899.
- Fast 2-D Discrete Cosine Transform, Vetterli, ICASSP '85 p. 1538.
- A Two-Dimensional Fast Cosine Transform, Haque, Tr. ASSP v ASSP-33 p. 1532.
- Real-Time Parallel and Fully Pipelined 2-D DCT Lattice Structures with
- Application to HDTV Systems, Chiu & Liu, Tr. CAS for Video Tech, v 2 p. 25.
- J.F. Blinn, "What's the Deal with the DCT", IEEE Computer Graphics and
- Applications, July 1993, pp.78-83.
-
- The free JPEG code (jpegsrc.v4.tar.Z) has one of the fastest implementations
- of the DCT. It's all in the files jfwddct.c and jrevdct.c (which do
- the dct and idct, respectively). See item 15 for ftp locations.
-
- ------------------------------------------------------------------------------
-
- Subject: [26] Are there algorithms and standards for audio compression?
-
-
- Yes. See the introduction to MPEG given in part 2 of this FAQ.
-
- A lossless compressor for 8bit and 16bit audio data (.au) is available by
- anonymous ftp at svr-ftp.eng.cam.ac.uk:/comp.speech/sources/shorten-1.11.tar.Z.
- An MSDOS executable is in shn109.exe. Shorten works by using Huffman
- coding of prediction residuals. Compression is generally better than
- that obtained by applying general purpose compression utilities to
- audio files. Shorten version 1.11 also supports lossy compression.
- Contact: Tony Robinson <ajr@dsl.eng.cam.ac.uk>.
-
- An MPEG audio player is available on sunsite.unc.edu in
- /pub/electronic-publications/IUMA/audio_utils/mpeg_players/Workstations,
- file maplay.tar.Z. The sources of the XING MPEG audio player for Windows
- are also there (sunsite.unc.edu) in
- /pub/electronic-publications/IUMA/audio_utils/mpeg_players/Windows/mpgaudio.zip
-
-
- Copied from the comp.dsp FAQ posted by guido@cwi.nl (Guido van Rossum):
-
- Strange though it seems, audio data is remarkably hard to compress
- effectively. For 8-bit data, a Huffman encoding of the deltas between
- successive samples is relatively successful. For 16-bit data,
- companies like Sony and Philips have spent millions to develop
- proprietary schemes.
-
- Public standards for voice compression are slowly gaining popularity,
- e.g. CCITT G.721 and G.723 (ADPCM at 32 and 24 kbits/sec). (ADPCM ==
- Adaptive Differential Pulse Code Modulation.) Free source code for a *fast*
- 32 kbits/sec ADPCM (lossy) algorithm is available by ftp from ftp.cwi.nl
- as /pub/adpcm.shar. (** NOTE: if you are using v1.0, you should get
- v1.1, released 17-Dec-1992, which fixes a serious bug -- the quality
- of v1.1 is claimed to be better than uLAW **)
-
- (Note that U-LAW and silence detection can also be considered
- compression schemes.)
-
-
- You can get a G.721/722/723 package by email to teledoc@itu.arcom.ch, with
-
- GET ITU-3022
-
- as the *only* line in the body of the message.
-
-
- A note on u-law from Markus Kuhn <mskuhn@immd4.informatik.uni-erlangen.de>:
-
- u-law (more precisely (greek mu)-law or 5-law if you have an 8-bit
- ISO terminal) is more an encoding than a compression method,
- although a 12 to 8 bit reduction is normally part of the encoding.
- The official definition is CCITT recommendation G.711. If you want
- to know how to get CCITT documents, check the Standards FAQ
- posted to news.answers or get the file standards-faq by ftp in
- directory rtfm.mit.edu:/pub/usenet/news.answers.
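For the curious, the companding step can be sketched in C along the lines of the widely circulated public-domain g711.c (the function name and constants here are illustrative, not the normative CCITT text):

```c
/* Sketch of G.711 u-law encoding: 16-bit linear -> 8-bit companded,
 * using the usual biased segment search. Names are illustrative. */

#define BIAS 0x84   /* adding 132 makes the segment boundaries regular */

static const int seg_end[8] = {
    0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF, 0x1FFF, 0x3FFF, 0x7FFF
};

unsigned char linear2ulaw(int pcm)      /* pcm: 16-bit signed sample */
{
    int mask, seg;

    if (pcm < 0) { pcm = BIAS - pcm; mask = 0x7F; }   /* magnitude + bias */
    else         { pcm += BIAS;      mask = 0xFF; }

    for (seg = 0; seg < 8; seg++)       /* find the segment */
        if (pcm <= seg_end[seg])
            break;
    if (seg >= 8)                       /* out of range: clip */
        return (unsigned char)(0x7F ^ mask);

    /* 3-bit segment, 4-bit quantized mantissa, complemented sign/mask */
    return (unsigned char)(((seg << 4) | ((pcm >> (seg + 3)) & 0x0F)) ^ mask);
}
```

Note the output is complemented, so linear 0 encodes to 0xFF, the usual u-law "positive zero".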
-
-
- See also the comp.dsp FAQ for more information on:
-
- - The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited linear
- prediction voice coder version 3.2a (CELP 3.2a)
- - The U.S. DoD's Federal-Standard-1015/NATO-STANAG-4198 based 2400 bps
- linear prediction coder version 53 (LPC-10e v53)
- - Realtime DSP code and hardware for FS-1015 and FS-1016
-
- You can find the comp.dsp FAQ in comp.dsp or news.answers with subject:
- "FAQ: Audio File Formats" or by ftp on rtfm.mit.edu
- in /pub/usenet/news.answers/audio-fmts/part1.
-
-
- CELP C code for Sun SPARCs is available for anonymous ftp at
- furmint.nectar.cs.cmu.edu, in directory celp.audio.compression.
- Version 3.2a is also in super.org:/pub/celp_3.2a.tar.Z.
-
-
- Recommended reading:
- Digital Coding of Waveforms: Principles and Applications to Speech and
- Video. N. S. Jayant and Peter Noll. Prentice-Hall, 1984, ISBN
- 0-13-211913-7.
-
-
- from Markus Kuhn <mskuhn@immd4.informatik.uni-erlangen.de>:
-
- One of the highest quality sound compression formats is called ASPEC;
- it was developed by a team at the Fraunhofer Institut in Erlangen
- (Germany) and others.
-
- ASPEC produces CD-like quality and offers several bitrates, one of which
- is 128 kbit/s. It is a lossy algorithm that throws away frequencies the
- human cochlea does not register, in addition to using sophisticated
- entropy coding. The 64 kbit/s ASPEC variant might soon bring hi-fi
- quality ISDN phone connections. It has been implemented on standard DSPs.
-
- The Layer 3 MPEG audio compression standard now contains what is officially
- called the best parts of the ASPEC and MUSICAM algorithms. A reference is:
-
- K.Brandenburg, G.Stoll, Y.F.Dehery, J.D.Johnston, L.v.d.Kerkhof,
- E.F.Schroeder: "The ISO/MPEG-Audio Codec: A Generic Standard for Coding
- of High Quality Digital Audio",
- 92nd. AES-convention, Vienna 1992, preprint 3336
-
-
- from Jutta Degener <jutta@cs.tu-berlin.de> and Carsten Bormann
- <cabo@cs.tu-berlin.de>:
-
- GSM 06.10 13 kbit/s RPE/LTP speech compression available
- --------------------------------------------------------
-
- The Communications and Operating Systems Research Group (KBS) at the
- Technische Universitaet Berlin is currently working on a set of
- UNIX-based tools for computer-mediated telecooperation that will be
- made freely available.
-
- As part of this effort we are publishing an implementation of the
- European GSM 06.10 provisional standard for full-rate speech
- transcoding, prI-ETS 300 036, which uses RPE/LTP (residual pulse
- excitation/long term prediction) coding at 13 kbit/s.
-
- GSM 06.10 compresses frames of 160 13-bit samples (8 kHz sampling
- rate, i.e. a frame rate of 50 Hz) into 260 bits; for compatibility
- with typical UNIX applications, our implementation turns frames of 160
- 16-bit linear samples into 33-byte frames (1650 Bytes/s).
- The quality of the algorithm is good enough for reliable speaker
- recognition; even music often survives transcoding in recognizable
- form (given the bandwidth limitations of 8 kHz sampling rate).
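The figures above fit together as follows (a sanity-check sketch, not part of the tub.cs.tu-berlin.de distribution): 8000 Hz divided by 160 samples gives 50 frames/s; 50 x 260 bits is the 13 kbit/s coded rate; and 50 x 33 bytes is the 1650 bytes/s of the byte-padded frames.

```c
/* Sketch: the arithmetic behind the GSM 06.10 figures quoted above. */

enum {
    SAMPLE_RATE       = 8000,  /* Hz */
    SAMPLES_PER_FRAME = 160,
    BITS_PER_FRAME    = 260,   /* coded GSM 06.10 frame */
    BYTES_PER_FRAME   = 33     /* 264 bits: the 4 extra bits are padding */
};

int frames_per_second(void)       { return SAMPLE_RATE / SAMPLES_PER_FRAME; }
int coded_bits_per_second(void)   { return frames_per_second() * BITS_PER_FRAME; }
int stream_bytes_per_second(void) { return frames_per_second() * BYTES_PER_FRAME; }
```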
-
- Version 1.0 of the implementation is available per anonymous ftp from
- tub.cs.tu-berlin.de as /pub/tubmik/gsm-1.0.tar.Z. Questions and bug
- reports should be directed to toast@tub.cs.tu-berlin.de.
- Note that the distribution is not available via E-mail (please use one
- of the ftp-via-E-mail servers).
-
-
- from Bob Kimball <rkimball@qualcomm.com>:
-
- I work for Qualcomm Inc. and we are designing a digital cellular telephone
- system. Our phone uses our variable rate vocoder (QCELP), which is designed
- for speech and compresses 64Kb/s speech to 8Kb/s through 1Kb/s, with 8Kb/s
- being full rate and 1Kb/s being 1/8 rate. It works great for speech.
-
- The QCELP process is documented in our Common Air Interface (CAI), which is
- available for anonymous ftp from lorien.qualcomm.com in /pub/cdma;
- each chapter is a PostScript file. The vocoder is described in appendix A.
- The whole document is quite large. This is the document currently
- going through the TIA standards committee, so it is not a final version. The
- appendix on the vocoder should be almost identical to the final version...
- whenever that comes out.
-
-
- from Nicola Ferioli <ser1509@cdc835.cdc.polimi.it>:
-
- oak.oakland.edu:/pub/msdos/sound/vocpak20.zip
- Lossless 8-bit sound file compressor
-
- VOCPACK is a compressor/decompressor for 8-bit digital sound using a
- lossless algorithm; it is useful to save disk space without degrading
- sound quality. It can compress signed and unsigned data, sampled at any
- rate, mono or stereo. Since the method used is not lossy, it isn't
- necessary to strip file headers before compressing.
-
- VOCPACK was developed for use with .VOC (SoundBlaster) and .WAV (Windows)
- files, but any 8-bit sound can be compressed since the program makes no
- assumptions about the file structure.
-
- The typical compression ratio obtained goes from 0.8 for files sampled at
- 11 kHz to 0.4 for 44 kHz files. The best results are obtained with 44 kHz
- sounds (mono or stereo): general-purpose archivers create files that can be
- twice as long as the output of VOCPACK. You can obtain smaller files
- using lossy compressors, but if your goal is to keep the sound quality
- unaltered you should use a lossless program like VOCPACK.
-
- ------------------------------------------------------------------------------
-
- Subject: [30] My archive is corrupted!
-
-
- The two most common reasons for this are
-
- (1) failing to use the magic word "tenex" (when connected to SIMTEL20 and
- other TOPS20 systems) or "binary" (when connected to UNIX systems) when
- transferring the file from an ftp site to your host machine. The
- reasons for this are technical and boring. A synonym for "tenex" is
- "type L 8", in case your ftp doesn't know what "tenex" means.
-
- (2) failing to use an eight-bit binary transfer protocol when transferring
- the file from the host to your PC. Make sure to set the transfer type
- to "binary" on both your host machine and your PC.
-
- ------------------------------------------------------------------------------
-
- Subject: [31] pkunzip reports a CRC error!
-
-
- The portable zip 1.1 contains many workarounds for undocumented restrictions
- in pkunzip. Compatibility is ensured for pkunzip 1.10 only. All previous
- versions (pkunzip 1.0x) have too many bugs and cannot be supported. This
- includes Borland unzip.
-
- So if your pkunzip reports a CRC error, check that you are not using
- an obsolete version. Get either pkzip 2.04g or unzip 5.0p1 (see question
- 2 above for ftp sites). To generate zip files compatible with pkunzip 1.10,
- use zip 1.1 (see item 2 above for ftp site).
-
- ------------------------------------------------------------------------------
-
- Subject: [32] VMS zip is not compatible with pkzip!
-
-
- The problem is most likely in the file transfer program.
-
- Many use kermit to transfer zipped files between PC and VMS VAX. The
- following VMS kermit settings make VMS-ZIP compatible with PKZIP:
-
- VMS kermit PC kermit
- --------------- --------------
-
- Uploading PKZIPped file to be UNZIPped: set fi ty fixed set fi ty bi
- Downloading ZIPped file to be PKUNZIPped: set fi ty block set fi ty bi
-
- If you are not using kermit, transfer a file created by pkzip on MSDOS
- to VMS, transfer it back to your PC and check that pkunzip can extract it.
-
- ------------------------------------------------------------------------------
-
- Subject: [33] I have a problem with Stacker or DoubleSpace!
-
-
- The newsgroup comp.compression is *not* the appropriate place to
- discuss one specific program on one specific operating system.
- Since you have bought a legal copy of Stacker or MSDOS 6.x, you have the
- documentation of your product; please read it. If you can't find the
- answer in the documentation, please report the problem to the Stac
- or Microsoft customer support. If you really feel that the net has to
- know about your problem, please post in one of the MSDOS newsgroups,
- such as comp.os.msdos.apps or comp.binaries.ibm.pc.d.
-
- ------------------------------------------------------------------------------
-
- Subject: [50] What is this 'tar' compression program?
-
-
- tar is not a compression program. It just combines several files
- into one, without compressing them. tar files are often compressed with
- 'compress', resulting in a .tar.Z file. See question 2, file type .tar.Z.
- GNU tar has the capability to (de)compress files as well.
-
- When you have to archive a lot of very small files, it is often
- preferable to create a single .tar file and compress it, rather than to
- compress the individual files separately. The compression program can
- thus take advantage of redundancy between separate files. The
- disadvantage is that you must uncompress the whole .tar file to
- extract any member. You can also improve compression by grouping
- files by type, as in:
-
- tar cvf - `ls | sort -t. +1` | gzip > file.tar.gz
-
- ------------------------------------------------------------------------------
-
- Subject: [51] I need a CRC algorithm
-
-
- As its name implies (Cyclic Redundancy Check), a CRC adds redundancy,
- whereas the topic of this group is removing it. But since this
- question comes up often, here is some code (by Rob Warnock <rpw3@sgi.com>).
-
- The following C code does CRC-32 in BigEndian/BigEndian byte/bit order.
- That is, the data is sent most significant byte first, and each of the bits
- within a byte is sent most significant bit first, as in FDDI. You will need
- to twiddle with it to do Ethernet CRC, i.e., BigEndian/LittleEndian byte/bit
- order. [Left as an exercise for the reader.]
-
- The CRCs this code generates agree with the vendor-supplied Verilog models
- of several of the popular FDDI "MAC" chips.
-
- u_long crc32_table[256];
- /* Initialized first time "crc32()" is called. If you prefer, you can
- * statically initialize it at compile time. [Another exercise.]
- */
-
- u_long crc32(u_char *buf, int len)
- {
- u_char *p;
- u_long crc;
-
- if (!crc32_table[1]) /* if not already done, */
- init_crc32(); /* build table */
- crc = 0xffffffff; /* preload shift register, per CRC-32 spec */
- for (p = buf; len > 0; ++p, --len)
- crc = (crc << 8) ^ crc32_table[(crc >> 24) ^ *p];
- return ~crc; /* transmit complement, per CRC-32 spec */
- }
-
- /*
- * Build auxiliary table for parallel byte-at-a-time CRC-32.
- */
- #define CRC32_POLY 0x04c11db7 /* AUTODIN II, Ethernet, & FDDI */
-
- init_crc32()
- {
- int i, j;
- u_long c;
-
- for (i = 0; i < 256; ++i) {
- for (c = i << 24, j = 8; j > 0; --j)
- c = c & 0x80000000 ? (c << 1) ^ CRC32_POLY : (c << 1);
- crc32_table[i] = c;
- }
- }
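The listing above relies on BSD's u_char/u_long types and K&R-era implicit int. A self-contained restatement using <stdint.h>, handy for checking a port, might look like this (the expected value 0xfc891918 for the string "123456789" is the published check value for this non-reflected, complemented CRC-32 variant):

```c
/* Self-contained sketch of the table-driven big-endian CRC-32 above,
 * restated with fixed-width types; the function name is ours. */
#include <stdint.h>

#define CRC32_POLY 0x04c11db7u          /* AUTODIN II, Ethernet, & FDDI */

static uint32_t crc32_table[256];

static void init_crc32(void)            /* build the byte-at-a-time table */
{
    for (int i = 0; i < 256; i++) {
        uint32_t c = (uint32_t)i << 24;
        for (int j = 8; j > 0; j--)
            c = (c & 0x80000000u) ? (c << 1) ^ CRC32_POLY : (c << 1);
        crc32_table[i] = c;
    }
}

uint32_t crc32_bigendian(const unsigned char *buf, int len)
{
    uint32_t crc = 0xffffffffu;         /* preload shift register, per spec */

    if (!crc32_table[1])                /* build table on first call */
        init_crc32();
    for (; len > 0; buf++, len--)
        crc = (crc << 8) ^ crc32_table[(crc >> 24) ^ *buf];
    return ~crc;                        /* transmit complement, per spec */
}
```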
-
- See also ftp.uni-erlangen.de:/pub/doc/ISO/english/async-HDLC, and the
- source of all archivers, such as the file makecrc.c in the sources of
- zip 2.0 (see item 2).
-
- ------------------------------------------------------------------------------
-
- Subject: [52] What about those people who continue to ask frequently asked
- questions in spite of the frequently asked questions document?
-
-
- Just send them a polite mail message, referring them to this document.
- There is no need to flame them on comp.compression. That would just
- add more noise to this group. Posted answers that are in the FAQ are
- just as annoying as posted questions that are in the FAQ.
-
- ------------------------------------------------------------------------------
-
- Subject: [53] Where are FAQ lists archived?
-
-
- Many are crossposted to news.answers. That newsgroup should have a
- long expiry time at your site; if not, talk to your sysadmin.
-
- FAQ lists are available by anonymous FTP from rtfm.mit.edu.
- The comp.compression FAQ that you are reading is in directory
- /pub/usenet/news.answers/compression-faq
-
- If you don't have FTP access, you can access the archives by mail
- server. Send an email message to mail-server@rtfm.mit.edu
- containing the commands
- send usenet/news.answers/compression-faq/part1
- send usenet/news.answers/compression-faq/part2
- send usenet/news.answers/compression-faq/part3
- For instructions, send an email message to the same address with the
- words "help" and "index" (no quotes) on separate lines. If you don't
- get a reply, check your return address, or add a line such as
- path myname@foo.edu
-
- ------------------------------------------------------------------------------
-
- Subject: [54] I need specs for graphics formats
-
-
- Have a look in directory /pub/graphics.formats on zamenhof.cs.rice.edu.
- It contains descriptions of gif, tiff, fits, etc...
-
- See also the FAQ list for comp.graphics. See item 53 for an ftp site.
-
- ------------------------------------------------------------------------------
-
- Subject: [55] Where can I find Lenna and other images?
-
-
- A bunch of standard images (lenna, baboon, cameraman, crowd, moon
- etc..) are on ftp site eedsp.gatech.edu (130.207.226.2) in directory
- /database/images. The images are in 256-level grayshades (256x256
- pixels, 256 "colors").
-
- [Note: the site ipl.rpi.edu mentioned below keeps changing. Images
- stay there for a while then disappear. They are again available at
- the time of writing (27 Dec 93).]
-
- The site ipl.rpi.edu (128.113.14.50) has standard images in two
- directories:
- ipl.rpi.edu:/pub/image/still/usc
- ipl.rpi.edu:/pub/image/still/canon
-
- (The directory /pub/image/sequence was taken offline because of
- possible copyright problems, but has come back again. In particular,
- Miss America is in subdirectories of /pub/image/sequence/missa.)
-
- In each of those directories are the following directories:
- bgr - 24 bit blue, green, red
- color - 24 bit red, green, blue
- gray - 8 bit grayscale uniform weighted
- gray601 - 8 bit grayscale CCIR-601 weighted
-
- And in these directories are the actual images.
-
- For example, the popular lena image is in
- ipl.rpi.edu:/pub/image/still/usc/color/lena # 24 bit RGB
- ipl.rpi.edu:/pub/image/still/usc/bgr/lena # 24 bit BGR
- ipl.rpi.edu:/pub/image/still/usc/gray/lena # 8 bit gray
-
- All of the images are in Sun rasterfile format. You can use the pbm
- utilities to convert them to whatever format is most convenient.
- [pbm is available in ftp.ee.lbl.gov:/pbmplus*.tar.Z].
- Questions about the ipl archive should be sent to help@ipl.rpi.edu.
-
-
- A few gray-scale still images and some raw test-result data are
- available in directory nic.funet.fi:/pub/graphics/misc/test-images.
-
- Medical images can be found in decaf.stanford.edu:/pub/images/medical/mri,
- eedsp.gatech.edu:/database/images/wchung/medical, and
- omicron.cs.unc.edu:/pub/softlab/CHVRTD.
-
-
- Rodney Peck <rodney@balltown.cma.com> is interested in some method
- of establishing a canonical ftp database of images but does not have
- the resources to provide an ftp site for that database. Send suggestions to
- rodney@balltown.cma.com.
-
-
- Beware: the same image often comes in many different forms, at
- different resolutions, etc... The original lenna image is 512 wide,
- 512 high, 8 bits per pel, red, green and blue fields. Gray-scale
- versions of Lenna have been obtained in two different ways from the
- original:
- (1) Using the green field as a gray-scale image, and
- (2) Doing an RGB->YUV transformation and saving the Y component.
- Method (1) makes it easier to compare different people's results since
- everyone's version should be the same using that method. Method (2)
- produces a more correct image.
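Method (2) can be sketched with the CCIR-601 luma weights in integer arithmetic (an illustrative approximation, not necessarily the exact code used to produce the archive's gray601 images):

```c
/* Sketch: CCIR-601 luma from 8-bit R, G, B.
 * Y = 0.299 R + 0.587 G + 0.114 B, scaled to integers with rounding. */
unsigned char rgb_to_luma(unsigned char r, unsigned char g, unsigned char b)
{
    return (unsigned char)((299 * r + 587 * g + 114 * b + 500) / 1000);
}
```

Method (1), by contrast, is just taking g as the output pixel.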
-
- For the curious: 'lena' or 'lenna' is a digitized Playboy centerfold,
- from November 1972. (Lenna is the spelling in Playboy, Lena is the
- Swedish spelling of the name.) Lena Soderberg (nee Sjooblom) was last
- reported living in her native Sweden, happily married with three kids
- and a job with the state liquor monopoly. In 1988, she was
- interviewed by some Swedish computer related publication, and she was
- pleasantly amused by what had happened to her picture. That was the
- first she knew of the use of that picture in the computer business.
-
- The editorial in the January 1992 issue of Optical Engineering (v. 31
- no. 1) details how Playboy has finally caught on to the fact that
- their copyright on Lenna Sjooblom's photo is being widely infringed.
- It sounds as if you will have to get permission from Playboy to
- publish it in the future.
-
- The CCITT test images are available on nic.funet.fi in directory
- pub/graphics/misc/test-images, files ccitt1.tif to ccitt8.tif.
-
- Note on the CCITT test images, by Robert Estes <estes@eecs.ucdavis.edu>:
-
- The ccitt files are in ipl.rpi.edu:/image-archive/bitmap/ccitt
- (128.113.14.50). [Note from FAQ maintainer: this directory has
- now disappeared; ipl.rpi.edu is a very volatile ftp site :-).]
- They are named ccitt-n.ras.Z where n goes from 1 to 8.
- Each file has an accompanying doc file called ccitt-n.ras.doc which
- describes the image file. Here's the doc file for ccitt-1.ras:
-
- Name ccitt-1.ras
- Size 1728 x 2376 x 1
- Type 1 bit standard format sun rasterfile
- Keywords binary standard image 1 bit fax
- Description
- One of eight images from the standard binary CCITT test image set.
-
- This set is commonly used to compare binary image compression
- techniques. The images are 1728x2376 pixels.
-
- ------------------------------------------------------------------------------
-
- Subject: [56] I am looking for a message digest algorithm
-
-
- Look on the ftp site rsa.com, in directory /pub. MD4 and MD5 are there.
- This question would be more appropriate on sci.crypt.
-
-
-
- End of part 1 of the comp.compression faq.
-